(54) Title: NOVEL HCV NON-STRUCTURAL POLYPEPTIDE

(57) Abstract: Polypeptides comprising a mutant non-structural Hepatitis C virus polypeptide useful in diagnostic and/or immunogenic
compositions are disclosed, in which the mutant has an N-terminal mutation that functionally disrupts the catalytic domain of NS3.
Polynucleotides encoding these polypeptides, host cells transformed with the polynucleotides, and methods of using the polypeptides
and polynucleotides are also disclosed.

NOVEL HCV NON-STRUCTURAL POLYPEPTIDE

FIELD OF THE INVENTION 

The present invention relates to polypeptides comprising a mutant
non-structural Hepatitis C virus ("HCV") polypeptide useful for immunogenic compounds
for use against HCV, methods of preparing and using the same, and immunogenic
compositions comprising the same. The present invention also relates to compositions
comprising (a) a mutant non-structural HCV polypeptide and (b) a viral polypeptide
that is not a non-structural HCV polypeptide and methods of using these compositions.

BACKGROUND OF THE INVENTION 

HCV is now recognized as the major agent of chronic hepatitis and liver disease 
worldwide. It is estimated that HCV infects about 400 million people worldwide, 
corresponding to more than 3% of the world population. 
Hepatitis C virus ("HCV") is a small enveloped RNA flavivirus, which contains
a positive-stranded RNA genome of about 10 kilobases. The genome has a single
uninterrupted ORF that encodes a protein of 3010-3011 amino acids. The structural
proteins of HCV include a core protein (C), which is highly immunogenic, as well as
two envelope proteins (E1 and E2), which likely form a heterodimer in vivo; the
non-structural proteins are NS2-NS5. It is known that the NS3 region of the virus is important
for post-translational processing of the polyprotein into individual proteins, and the
NS5 region encodes an RNA-dependent RNA polymerase.

Virus-specific T lymphocytes, along with neutralizing antibodies, are the
mainstay of the antiviral immune defense in established viral infections. Whereas
CD8+ cytotoxic T cells eliminate virus-infected cells, CD4+ T helper cells are essential
for the efficient regulation of the antiviral immune response. CD4+ T helper cells
recognize specific antigens as peptides bound to autologous HLA class II molecules
(viral antigens or particles are taken up by professional antigen-presenting cells,
processed to peptides, bound to HLA class II molecules in the lysosomal compartment,
and transported back to the cell surface). Several observations support an important
role of CD4+ T cells in the elimination of HCV infection. Tsai et al., 1997 Hepatology
25:449-458; Diepolder et al. 1995 Lancet 346:1006-1009; Missale et al. 1996 JCI 98:
706-714; Botarelli et al. 1993 Gastro 104:580-587; Diepolder et al. 1997 J. Virol. 71:
6011. Immunogenic peptides usually have a minimal length of 8-11 amino acids.
However, since the peptide binding groove of HLA class II molecules seems to be open 
at both ends, longer peptides are tolerated. Thus peptides eluted from HLA class II 
molecules are typically in the range of 15-25 amino acids. HLA class II molecules are 
extremely polymorphic and each allele seems to have its individual requirements for
peptide binding. Thus the HLA class II repertoire of a given individual determines
which viral peptides can be presented to T cells. Recognition of the specific
HLA-peptide complex by the T cell receptor, accompanied by appropriate costimulatory
signals, leads to T cell activation, secretion of cytokines, and T cell proliferation.

Numerous studies demonstrate that HLA class II restricted CD4+ responses are
determined by stimulating peripheral blood mononuclear cells with recombinant viral
antigens or peptides. Botarelli et al., (1993) Gastroenterology 104:580-587; Ferrari et
al., (1994) Hepatology 19:286-295; Minutello et al., (1993) J. Exp. Med. 178:17-25;
Hoffmann et al., (1995) Hepatology 21:632-638; Iwata et al., (1995) Hepatology
22:1057-1064; and Tsai et al., (1995) Hepatology 21:908-912.

Polyclonal multispecific CD4+ T cell responses have been detected in patients
with chronic hepatitis C. Additionally, CD8+ CTLs were shown to be important in
resolving acute HCV infection in chimpanzees (Cooper et al. Immunity 1999). About
50% of patients with chronic hepatitis C demonstrate a detectable virus-specific CD4+
T cell response, which is most frequently directed against HCV core and/or NS4 and
tends to be more common in patients who achieve sustained viral clearance during
interferon-α therapy.

Depending on the pattern of lymphokines, CD4+ T helper cells have been
classified as TH1, TH0, or TH2. Cytokines of the TH1 type are typically IFN-γ,
lymphotoxin, and interleukin-2 (IL-2), which are believed to support activation of
virus-specific CD8+ T cells and natural killer cells. The TH2 cytokines IL-4, IL-5,
IL-10, and IL-13 are important for B cell activation and differentiation, thus inducing a
humoral immune response.

During acute hepatitis C infection a strong and sustained TH1/TH0 response to
NS3 and possibly to other nonstructural proteins is associated with a self-limited course
of the disease. Diepolder et al., (1995) Lancet 346:1006-1007, showed all CD4+ T cell
clones to have a TH1 or TH0 cytokine profile, suggesting that the clones support
cytotoxic immune mechanisms in vivo. The majority of CD4+ T cell clones responded
to a relatively short segment of NS3, namely amino acids 1207-1278, suggesting that
this region of NS3 is immunodominant for CD4+ T cells. More than 70% of those
who contract HCV develop chronic infection and hepatitis, and a significant portion of
them progress to cirrhosis and eventually hepatocellular carcinoma. The only approved
therapy at present is a 6- to 12-month course of interferon α, which leads to sustained
improvement in only 20% of patients. So far, no commercial vaccine is available.

Thus, there remains a need for compositions and methods capable of promoting
anti-HCV responses.



SUMMARY OF THE INVENTION

In one aspect, the present invention relates to isolated polypeptides comprising
mutant hepatitis C ("HCV") polypeptides comprising at least portions of NS3, NS4,
and NS5. In a preferred aspect, NS3 is encoded by a nucleic acid sequence having an
N-terminal deletion to remove the catalytic domain. The NS mutant polypeptides can
include NS3, NS4a, NS4b, NS5a, NS5b or portions thereof. For example, in various
embodiments, the mutant NS polypeptide comprises NS3, NS4 (NS4a and NS4b) and
NS5 (NS5a and NS5b). In other embodiments, the NS polypeptide consists of NS3 and
NS4 (for example, NS4a and/or NS4b) or NS3 and NS5 (for example, NS5a and/or
NS5b). Other combinations of full-length or fragments of non-structural components
are also contemplated.

In another preferred aspect, the polypeptides further comprise a viral
polypeptide that is not a non-structural HCV polypeptide. Such polypeptides are
preferably C, or antigenic fragments thereof, more preferably, truncated C of HCV.
Other polypeptides are preferably E, or antigenic fragments thereof, more preferably,
E1 or E2 of HCV. Such polypeptides need not be encoded by a natural HCV genome,
and include, for example, truncated or otherwise mutant HCV polypeptides or
polypeptides derived from other genomes, such as, for example, polypeptides of HBV.

Thus, the invention includes an isolated mutant non-structural ("NS") HCV polypeptide
comprising a polypeptide having a mutation in the catalytic domain of NS3 that
functionally disrupts the catalytic domain. The mutation can be, for example, a
deletion or a substitution mutation. In certain embodiments, the mutant NS polypeptide
comprises NS3, NS4 and NS5. In other embodiments, the mutant NS polypeptides
described herein further comprise a second viral polypeptide that is not NS3, NS4, or
NS5 of HCV, for example an HCV Core polypeptide ("C"), or fragment thereof, or an
HCV envelope protein ("E"), for example E1 and/or E2. In certain embodiments, C is
truncated (e.g., at amino acid 121).

In another aspect, the present invention relates to compositions comprising any
of the mutant hepatitis C ("HCV") polypeptides described herein, for example
polypeptides comprising at least portions of NS3, NS4, and NS5. In a preferred aspect,
NS3 is encoded by a nucleic acid sequence having an N-terminal deletion to disrupt the
function of the catalytic domain, for example by removing this domain. In another
preferred aspect, the polypeptides further comprise a viral polypeptide that is not a
non-structural HCV polypeptide. Such polypeptides are preferably C, or antigenic
fragments thereof, more preferably, truncated C of HCV. Other polypeptides are
preferably E, or antigenic fragments thereof, more preferably, E1 or E2 of HCV. Such
polypeptides need not be encoded by a natural HCV genome, and include, for example,
truncated or otherwise mutant HCV polypeptides or polypeptides derived from other
genomes, such as, for example, polypeptides of HBV. In another aspect, the invention
includes a composition comprising (a) any of the polypeptides described herein; and (b)
a pharmaceutically acceptable excipient (e.g., carrier and/or adjuvant).

In another aspect, the invention includes an isolated and purified polynucleotide
which encodes any of the mutant HCV polypeptides described herein. In certain
embodiments, the invention includes a composition comprising (a) the isolated purified
polynucleotide encoding any of the mutant HCV polypeptides; and (b) a
pharmaceutically acceptable excipient. The polynucleotide can be, for example, DNA,
and can be in a plasmid. Additionally, the polynucleotides described herein may
be included in an expression vector as shown in the attached Figures and Sequence
Listings.

In another aspect, the present invention relates to host cells transformed with
expression vectors comprising a nucleic acid sequence encoding a mutant HCV
polypeptide comprising at least portions of NS3, NS4, and NS5. In a preferred aspect,
the expression vectors of the host cells further comprise at least one nucleic acid
sequence encoding a viral polypeptide that is not a non-structural HCV polypeptide.
Such polypeptides are preferably C, or antigenic fragments thereof, more preferably,
truncated C of HCV. Other polypeptides are preferably E, or antigenic fragments
thereof, more preferably, E1 or E2 of HCV. Such polypeptides need not be encoded
by a natural HCV genome, and include, for example, truncated or otherwise mutant
HCV polypeptides or polypeptides derived from other genomes, such as, for example,
polypeptides of HBV. In another preferred aspect, the nucleic acid sequences of the
expression vectors are coexpressed. In yet another preferred aspect, the host cells are
yeast cells or mammalian cells.

In another aspect, the present invention relates to expression vectors comprising
a nucleic acid sequence encoding a mutant HCV polypeptide comprising NS3, NS4,
and NS5. In a preferred aspect, the expression vectors of the host cells further
comprise at least one nucleic acid sequence encoding a viral polypeptide that is not a
non-structural HCV polypeptide. Such polypeptides are preferably C, or antigenic
fragments thereof, more preferably, truncated C of HCV. Other polypeptides are
preferably E, or antigenic fragments thereof, more preferably, E1 or E2 of HCV.
Importantly, such polypeptides need not be encoded by a natural HCV genome, and
include, for example, truncated or otherwise mutant HCV polypeptides or polypeptides
derived from other genomes, such as, for example, polypeptides of HBV.

In another aspect, the present invention relates to methods of preparing mutant HCV
polypeptides. In a preferred aspect, the method comprises the steps of transforming a
host cell with an expression vector, said vector comprising a nucleic acid sequence
encoding a mutant HCV polypeptide comprising at least portions of NS3, NS4, and
NS5, and isolating said polypeptide. In another preferred aspect, the HCV polypeptide
further comprises a viral polypeptide that is not a non-structural HCV polypeptide.
Such polypeptides are preferably C, or antigenic fragments thereof, more preferably,
truncated C of HCV. Other polypeptides are preferably E, or antigenic fragments
thereof, more preferably, E1 or E2 of HCV. Such polypeptides need not be encoded by
a natural HCV genome, and include, for example, truncated or otherwise mutant HCV
polypeptides or polypeptides derived from other genomes, such as, for example,
polypeptides of HBV. In another preferred aspect, the host cells are yeast cells or
mammalian cells.

In another aspect, the present invention relates to antibodies which specifically
bind to a mutant HCV polypeptide comprising NS3, NS4, and NS5, and to methods of
making and using the same. In a preferred aspect, the HCV polypeptide further
comprises a viral polypeptide that is not a non-structural HCV polypeptide. Such
polypeptides are preferably C, or antigenic fragments thereof, more preferably,
truncated C of HCV. Other polypeptides are preferably E, or antigenic fragments
thereof, more preferably, E1 or E2 of HCV. Such polypeptides need not be encoded
by a natural HCV genome, and include, for example, truncated or otherwise mutant HCV
polypeptides or polypeptides derived from other genomes, such as, for example,
polypeptides of HBV. In another preferred aspect, the antibody is either monoclonal or
polyclonal.

In yet another aspect, a method of preparing a mutant NS HCV polypeptide is provided,
wherein the method comprises the steps of (a) transforming a host cell with any of the
expression vectors described herein, under conditions wherein the polypeptide is
expressed; and (b) isolating the polypeptide. The host cell can be, for example, a yeast
cell, a mammalian cell, a plant cell or an insect cell. The polypeptide can be expressed
and isolated intracellularly or can be secreted and isolated from the surrounding
environment.

In a still further aspect, a method of eliciting an immune response in a subject is
provided. The immune response can be elicited by administering any of the
polynucleotides and/or polypeptides described herein in one or multiple doses.

These and other embodiments of the subject invention will readily occur to 
those of skill in the art in light of the disclosure herein. 

BRIEF DESCRIPTION OF THE FIGURES 
FIG. 1 shows the cloning scheme for generating pCMV-NS35.

FIG. 2 shows the 9621bp vector pCMV-NS35.

FIG. 3 shows the nucleic acid sequence of pCMV-NS35 (SEQ ID NO:1), including the
nucleic acid sequence of the NS35 ORF, and also the translation of NS35 (SEQ ID
NO:2).

FIG. 4 shows the 9621bp pCMV-delNS35.

FIG. 5 shows the nucleic acid sequence of pCMV-delNS35 (SEQ ID NO:3), including
the nucleic acid sequence of the delNS35 ORF, and also the translation of the delNS35
polypeptide (SEQ ID NO:4).

FIG. 6 shows the 4276bp pCMV-II.

FIG. 7 shows the nucleic acid sequence of pCMV-II (SEQ ID NO:5).

FIG. 8 shows the 6300bp pCMV-NS34A.

FIG. 9 shows the nucleic acid sequence of pCMV-NS34A (SEQ ID NO:6), including
the nucleic acid sequence of the NS34A ORF, and also the translation of NS34A (SEQ
ID NO:7).

FIG. 10 shows the cloning scheme for generating pd.ΔNS3NS5.

FIG. 11 shows the nucleic and amino acid sequences of pd.ΔNS3NS5 (SEQ ID NO:8
and 9).

FIG. 12 shows the Western blot of proteins expressed by S. cerevisiae strain AD3
transformed with pd.ΔNS3NS5.

FIG. 13 shows the cloning scheme for generating pd.ΔNS3NS5.pj.

FIG. 14 shows the nucleic and amino acid sequences of pd.ΔNS3NS5.pj (SEQ ID
NO:10 and 11).

FIG. 15 shows the Western blot of proteins expressed by S. cerevisiae strain AD3
transformed with pd.ΔNS3NS5.pj, specifically demonstrating the expression of
ΔNS3NS5 polypeptide.

FIG. 16 shows the cloning scheme for generating pdΔNS3NS5.pj.core121RT and
pdΔNS3NS5.pj.core173RT.

FIG. 17 shows the nucleic and amino acid sequences of pd.ΔNS3NS5.pj.core121 (SEQ
ID NO:12 and 13).

FIG. 18 shows the nucleic and amino acid sequences of pd.ΔNS3NS5.pj.core173 (SEQ
ID NO:14 and 15).

FIG. 19 shows the Western blot of proteins expressed by S. cerevisiae strain AD3
transformed with pd.ΔNS3NS5.pj, specifically demonstrating the expression of
ΔNS3NS5.core121 and ΔNS3NS5.core173 polypeptides. Lanes 1 and 7 show See
Blue Standards. Lane 2 shows control yeast plasmid. Lanes 3 and 4 show
ΔNS3NS5.core121RT polypeptide, colonies 1 and 2. Lanes 5 and 6 show
ΔNS3NS5.core173RT polypeptide, colonies 3 and 4.
FIG. 20 shows the cloning scheme for generating pdΔNS3NS5.pj.core140RT and
pdΔNS3NS5.pj.core150RT.

FIG. 21 shows the nucleic and amino acid sequences of pd.ΔNS3NS5.pj.core140 (SEQ
ID NO:16 and 17).

FIG. 22 shows the nucleic and amino acid sequences of pd.ΔNS3NS5.pj.core150 (SEQ
ID NO:18 and 19).

FIG. 23 shows the Western blot of proteins expressed by S. cerevisiae strain AD3
transformed with pd.ΔNS3NS5.pj, specifically demonstrating the expression of
ΔNS3NS5core140 and ΔNS3NS5core150 polypeptides. Lane 1 shows See Blue
Standards. Lanes 2 and 3 show ΔNS3NS5core140RT polypeptide, colonies 5 and 6.
Lanes 4 and 5 show ΔNS3NS5core150RT polypeptide, colonies 7 and 8. Lane 6 shows
control yeast plasmid. Lane 7 shows ΔNS3NS5core121RT polypeptide, colony 1.
Lane 8 shows ΔNS3NS5core173RT polypeptide, colony 5.

DETAILED DESCRIPTION OF THE INVENTION 

The practice of the present invention will employ, unless otherwise indicated,
conventional techniques of molecular biology, microbiology, recombinant DNA
techniques, and immunology, which are within the skill of the art. Such techniques are
explained fully in the literature. See e.g., Sambrook, et al., MOLECULAR CLONING:
A LABORATORY MANUAL (1989); DNA CLONING, VOLUMES I AND II (D. N.
Glover ed. 1985); OLIGONUCLEOTIDE SYNTHESIS (M. J. Gait ed., 1984);
NUCLEIC ACID HYBRIDIZATION (B. D. Hames & S. J. Higgins eds. 1984);
TRANSCRIPTION AND TRANSLATION (B. D. Hames & S. J. Higgins eds. 1984);
ANIMAL CELL CULTURE (R. I. Freshney ed. 1986); IMMOBILIZED CELLS AND
ENZYMES (IRL Press, 1986); B. Perbal, A PRACTICAL GUIDE TO MOLECULAR
CLONING (1984); the series, METHODS IN ENZYMOLOGY (Academic Press,
Inc.); GENE TRANSFER VECTORS FOR MAMMALIAN CELLS (J. H. Miller and
M. P. Calos eds. 1987, Cold Spring Harbor Laboratory), Methods in Enzymology Vol.
154 and Vol. 155 (Wu and Grossman, and Wu, eds., respectively); Mayer and Walker
eds. (1987), IMMUNOHISTOCHEMICAL METHODS IN CELL AND
MOLECULAR BIOLOGY (Academic Press, London); Scopes, (1987), PROTEIN
PURIFICATION: PRINCIPLES AND PRACTICE, Second Edition (Springer-Verlag,
New York); and HANDBOOK OF EXPERIMENTAL IMMUNOLOGY, VOLUMES
I-IV (D. M. Weir and C. C. Blackwell eds. 1986).

It must be noted that, as used in this specification and the appended claims, the
singular forms "a", "an" and "the" include plural referents unless the content clearly
dictates otherwise. Thus, for example, reference to "an antigen" includes a mixture of
two or more antigens, and the like.

1. Definitions 

In describing the present invention, the following terms will be employed, and 
are intended to be defined as indicated below. 

The term "hepatitis C virus" (HCV) refers to an agent causative of Non-A,
Non-B Hepatitis (NANBH). The nucleic acid sequence and putative amino acid sequence of
HCV are described in U.S. Patent Nos. 5,856,437 and 5,350,671. The disease caused by
HCV is called hepatitis C, formerly called NANBH. The term HCV, as used herein,
denotes a viral species of which pathogenic strains cause NANBH, as well as
attenuated strains or defective interfering particles derived therefrom.

HCV is a member of the viral family Flaviviridae. The morphology and
composition of Flavivirus particles are known, and are discussed in Reed et al., Curr.
Stud. Hematol. Blood Transfus. (1998), 62:1-37; HEPATITIS C VIRUSES IN FIELDS
VIROLOGY (B.N. Fields, D.M. Knipe, P.M. Howley, eds.) (3d ed. 1996). It has
recently been found that portions of the HCV genome are also homologous to
pestiviruses. Generally, with respect to morphology, Flaviviruses contain a central
nucleocapsid surrounded by a lipid bilayer. Virions are spherical and have a diameter
of about 40-50 nm. Their cores are about 25-30 nm in diameter. Along the outer
surface of the virion envelope are projections that are about 5-10 nm long with terminal
knobs about 2 nm in diameter.

The HCV genome is comprised of RNA. It is known that RNA-containing
viruses have relatively high rates of spontaneous mutation. Therefore, there can be
multiple strains, which can be virulent or avirulent, within the HCV class or species.
The ORF of HCV, including the translation spans of the core, non-structural, and
envelope proteins, is shown in U.S. Patent Nos. 5,856,437 and 5,350,671.

The terms "polypeptide" and "protein" refer to a polymer of amino acid
residues and are not limited to a minimum length of the product. Thus, peptides,
oligopeptides, dimers, multimers, and the like, are included within the definition. Both
full-length proteins and fragments thereof are encompassed by the definition. The
terms also include postexpression modifications of the polypeptide, for example,
glycosylation, acetylation, phosphorylation and the like. Furthermore, for purposes of
the present invention, a "polypeptide" refers to a protein which includes modifications,
such as deletions, additions and substitutions (generally conservative in nature), to the
native sequence, so long as the protein maintains the desired activity. These
modifications may be deliberate, as through site-directed mutagenesis, or may be
accidental, such as through mutations of hosts which produce the proteins or errors due
to PCR amplification.

An HCV polypeptide is a polypeptide, as defined above, derived from the HCV
polyprotein. The polypeptide need not be physically derived from HCV, but may be
synthetically or recombinantly produced. Moreover, the polypeptide may be derived
from any of the various HCV strains, such as from strains 1, 2, 3 or 4 of HCV. A
number of conserved and variable regions are known between these strains and, in
general, the amino acid sequences of epitopes derived from these regions will have a
high degree of sequence homology, e.g., amino acid sequence homology of more than
30%, preferably more than 40%, when the two sequences are aligned and homology
determined by any of the programs or algorithms described herein. Thus, for example,
the term "NS4" polypeptide refers to native NS4 from any of the various HCV strains,
as well as NS4 analogs, muteins and immunogenic fragments, as defined further below.
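
By way of illustration only, the percentage figures above can be read as the share of aligned positions at which two sequences carry the same residue. The minimal Python sketch below assumes the two sequences have already been aligned by some alignment program and padded with "-" for gaps. The function name percent_identity and the example strings are hypothetical, and simple identity is used here as a stand-in for the homology measures referred to in the text.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of gap-free aligned columns in which both residues match.

    Assumption: the inputs are pre-aligned, equal-length strings that
    use '-' as the gap character.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a.upper(), aligned_b.upper()):
        if res_a == "-" or res_b == "-":
            continue  # columns containing a gap are not scored
        compared += 1
        if res_a == res_b:
            matches += 1

    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Hypothetical aligned fragments, not taken from any HCV strain.
identity = percent_identity("ACDE-G", "ACDFHG")
meets_30_percent = identity > 30.0
```

Other scoring choices (for example, counting gap columns as mismatches, or scoring conservative replacements as similar) would give different percentages for the same pair of sequences.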

Further, the terms "ΔNS35," "delNS35," "ΔNS3NS5," and "ΔNS3-5" as used
herein refer to a mutant polypeptide, comprising at least portions of NS3, NS4, or NS5,
comprising a deletion in, or mutation of, the NS3 protease active site region to render
the protease non-functional. In one embodiment, ΔNS3-5 comprises amino acids
1242-3011, as shown in FIG. 5, or polypeptides substantially homologous thereto. It will be
readily apparent to one of ordinary skill in the art how to determine that NS3 protease
has been rendered non-functional. If the protease is functional, one will obtain protein
of the expected molecular weight upon expression. As set forth in Example 2 and
Figure 15, using SDS-PAGE, 4-20%, a protein having a molecular weight of
approximately 194kD was obtained when strain AD3 was transformed with
pd.ΔNS3NS5.PJ clone #5. One skilled in the art could readily determine whether a
protein of the desired molecular weight was expressed for any given deletion or
mutation.
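
As a rough plausibility check of the molecular weight stated above, the number of residues in the span 1242-3011 can be multiplied by an average residue mass. The Python sketch below is illustrative only: the figure of about 110 daltons per residue is a common rule of thumb and is an assumption not taken from this specification, and the function names are hypothetical.

```python
# Assumption: rule-of-thumb average mass of one amino acid residue, in daltons.
AVG_RESIDUE_MASS_DA = 110.0


def residue_count(first: int, last: int) -> int:
    """Number of residues in an inclusive, 1-based span such as 1242-3011."""
    if first < 1 or last < first:
        raise ValueError("expected 1 <= first <= last")
    return last - first + 1


def approx_mass_kd(n_residues: int, avg_mass_da: float = AVG_RESIDUE_MASS_DA) -> float:
    """Approximate polypeptide mass in kilodaltons from its residue count."""
    return n_residues * avg_mass_da / 1000.0


def subsequence(polyprotein: str, first: int, last: int) -> str:
    """Extract an inclusive, 1-based residue span from a sequence string."""
    return polyprotein[first - 1:last]


n = residue_count(1242, 3011)   # 1770 residues
estimate = approx_mass_kd(n)    # about 194.7 kD under the stated assumption
```

Under these assumptions, amino acids 1242-3011 give 1770 residues and an estimate of about 194.7 kD, in line with the approximately 194kD band noted above. An estimate of this kind ignores the actual amino acid composition and any post-expression modification.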

The terms "analog" and "mutein" refer to biologically active derivatives of the
reference molecule, or fragments of such derivatives, that retain desired activity, such
as the ability to stimulate a cell-mediated immune response, as defined below. In
general, the term "analog" refers to compounds having a native polypeptide sequence
and structure with one or more amino acid additions, substitutions (generally
conservative in nature) and/or deletions, relative to the native molecule, so long as the
modifications do not destroy immunogenic activity. The term "mutein" refers to
peptides having one or more peptide mimics ("peptoids"), such as those described in
International Publication No. WO 91/04282. Preferably, the analog or mutein has at
least the same immunoactivity as the native molecule. Methods for making
polypeptide analogs and muteins are known in the art and are described further below.
Particularly preferred analogs include substitutions that are conservative in
nature, i.e., those substitutions that take place within a family of amino acids that are
related in their side chains. Specifically, amino acids are generally divided into four
families: (1) acidic - aspartate and glutamate; (2) basic - lysine, arginine, histidine;
(3) non-polar - alanine, valine, leucine, isoleucine, proline, phenylalanine, methionine,
tryptophan; and (4) uncharged polar - glycine, asparagine, glutamine, cysteine, serine,
threonine, tyrosine. Phenylalanine, tryptophan, and tyrosine are sometimes classified
as aromatic amino acids. For example, it is reasonably predictable that an isolated
replacement of leucine with isoleucine or valine, an aspartate with a glutamate, a
threonine with a serine, or a similar conservative replacement of an amino acid with a
structurally related amino acid, will not have a major effect on the biological activity.
For example, the polypeptide of interest may include up to about 5-10 conservative or
non-conservative amino acid substitutions, or even up to about 15-25 conservative or
non-conservative amino acid substitutions, or any integer between 5-25, so long as the
desired function of the molecule remains intact. One of skill in the art may readily
determine regions of the molecule of interest that can tolerate change by reference to
Hopp/Woods and Kyte-Doolittle plots, well known in the art.
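
The four side-chain families listed above lend themselves to a simple lookup. The following Python sketch, offered only as an illustration, encodes those families with standard one-letter amino acid codes and treats a substitution as conservative when both residues fall in the same family. The names FAMILIES, family_of and is_conservative are hypothetical, and the rule is a simplification: it does not predict whether a given replacement actually preserves activity.

```python
# The four families named in the text, written with standard one-letter codes.
FAMILIES = {
    "acidic": set("DE"),                # aspartate, glutamate
    "basic": set("KRH"),                # lysine, arginine, histidine
    "non-polar": set("AVLIPFMW"),       # Ala, Val, Leu, Ile, Pro, Phe, Met, Trp
    "uncharged polar": set("GNQCSTY"),  # Gly, Asn, Gln, Cys, Ser, Thr, Tyr
}


def family_of(residue: str) -> str:
    """Return the family name for a one-letter amino acid code."""
    code = residue.upper()
    for name, members in FAMILIES.items():
        if code in members:
            return name
    raise ValueError(f"unknown amino acid code: {residue!r}")


def is_conservative(native: str, replacement: str) -> bool:
    """True when both residues belong to the same side-chain family."""
    return family_of(native) == family_of(replacement)


# The examples given in the text: Leu to Ile, Asp to Glu, Thr to Ser.
examples = [is_conservative("L", "I"),
            is_conservative("D", "E"),
            is_conservative("T", "S")]
```

A replacement such as aspartate with lysine crosses families (acidic to basic) and is therefore reported as non-conservative by this sketch.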

By "fragment" is intended a polypeptide consisting of only a part of the intact
full-length polypeptide sequence and structure. The fragment can include a C-terminal
deletion and/or an N-terminal deletion of the native polypeptide. An "immunogenic
fragment" of a particular HCV protein will generally include at least about 5-10
contiguous amino acid residues of the full-length molecule, preferably at least about
15-25 contiguous amino acid residues of the full-length molecule, and most preferably
at least about 20-50 or more contiguous amino acid residues of the full-length
molecule, that define an epitope, or any integer between 5 amino acids and the
full-length sequence, provided that the fragment in question retains immunogenic activity,
as measured by the assays described herein. For a description of various HCV
epitopes, see, e.g., Chien et al., Proc. Natl. Acad. Sci. USA (1992) 89:10011-10015;
Chien et al., J. Gastroent. Hepatol. (1993) 8:S33-39; Chien et al., International
Publication No. WO 93/00365; Chien, D.Y., International Publication No. WO
94/01778; commonly owned, allowed U.S. Patent Application Serial Nos. 08/403,590
and 08/444,818.
The term "epitope" as used herein refers to a sequence of at least about 3 to 5,
preferably about 5 to 10 or 15, and not more than about 1,000 amino acids (or any
integer therebetween), which define a sequence that by itself or as part of a larger
sequence, binds to an antibody generated in response to such sequence. There is no
critical upper limit to the length of the fragment, which may comprise nearly the
full length of the protein sequence, or even a fusion protein comprising two or more
epitopes from the HCV polyprotein. An epitope for use in the subject invention is not
limited to a polypeptide having the exact sequence of the portion of the parent protein
from which it is derived. Indeed, viral genomes are in a state of constant flux and
contain several variable domains which exhibit relatively high degrees of variability
between isolates. Thus the term "epitope" encompasses sequences identical to the
native sequence, as well as modifications to the native sequence, such as deletions,
additions and substitutions (generally conservative in nature).

Regions of a given polypeptide that include an epitope can be identified using
any number of epitope mapping techniques, well known in the art. See, e.g., Epitope
Mapping Protocols in Methods in Molecular Biology, Vol. 66 (Glenn E. Morris, Ed.,
1996) Humana Press, Totowa, New Jersey. For example, linear epitopes may be
determined by, e.g., concurrently synthesizing large numbers of peptides on solid
supports, the peptides corresponding to portions of the protein molecule, and reacting
the peptides with antibodies while the peptides are still attached to the supports. Such
techniques are known in the art and described in, e.g., U.S. Patent No. 4,708,871;
Geysen et al. (1984) Proc. Natl. Acad. Sci. USA 81:3998-4002; Geysen et al. (1986)
Molec. Immunol. 23:709-715. Similarly, conformational epitopes are readily
identified by determining spatial conformation of amino acids such as by, e.g., x-ray
crystallography and 2-dimensional nuclear magnetic resonance. See, e.g., Epitope
Mapping Protocols, supra. Antigenic regions of proteins can also be identified using
standard antigenicity and hydropathy plots, such as those calculated using, e.g., the
Omiga version 1.0 software program available from the Oxford Molecular Group. This
computer program employs the Hopp/Woods method, Hopp et al., Proc. Natl. Acad.
Sci. USA (1981) 78:3824-3828 for determining antigenicity profiles, and the
Kyte-Doolittle technique, Kyte et al., J. Mol. Biol. (1982) 157:105-132 for hydropathy plots.
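
For readers unfamiliar with hydropathy plots, the Python sketch below shows the kind of calculation involved: each residue is assigned its Kyte-Doolittle hydropathy value, and the values are averaged over a sliding window. It is a minimal illustration and not the Omiga program: the function name hydropathy_profile, the default window of 9 residues and the reporting of each window at its central residue are assumptions made for this sketch.

```python
from typing import List, Tuple

# Kyte-Doolittle hydropathy values (Kyte et al., J. Mol. Biol. (1982) 157:105-132).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence: str, window: int = 9) -> List[Tuple[int, float]]:
    """Sliding-window mean hydropathy.

    Returns (1-based position of the window's central residue, mean value)
    for every full window. Residues outside the 20 standard one-letter
    codes raise KeyError.
    """
    seq = sequence.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")

    scores = [KYTE_DOOLITTLE[res] for res in seq]
    profile = []
    for start in range(len(seq) - window + 1):
        mean = sum(scores[start:start + window]) / window
        center = start + window // 2 + 1
        profile.append((center, mean))
    return profile


# Toy sequence for illustration; not an HCV sequence.
profile = hydropathy_profile("MSTNPKPQRKTKRNT", window=9)
```

Strongly negative window averages mark hydrophilic stretches, which are more likely to be surface-exposed. A Hopp/Woods antigenicity profile is computed in the same sliding-window manner using a different per-residue scale.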

As used herein, the term "conformational epitope" refers to a portion of a
full-length protein, or an analog or mutein thereof, having structural features native to the
amino acid sequence encoding the epitope within the full-length natural protein. Native
structural features include, but are not limited to, glycosylation and three dimensional
structure. Preferably, a conformational epitope is produced recombinantly and is
expressed in a cell from which it is extractable under conditions which preserve its
desired structural features, e.g. without denaturation of the epitope. Such cells include
bacteria, yeast, insect, and mammalian cells. Expression and isolation of recombinant
conformational epitopes from the HCV polyprotein are described in, e.g., International
Publication Nos. WO 96/04301, WO 94/01778, WO 95/33053, WO 92/08734.

An "immunological response" to an HCV antigen (including both polypeptides
and polynucleotides encoding polypeptides that are expressed in vivo) or composition is
the development in a subject of a humoral and/or a cellular immune response to
molecules present in the composition of interest. For purposes of the present invention,
a "humoral immune response" refers to an immune response mediated by antibody
molecules, while a "cellular immune response" is one mediated by T-lymphocytes
and/or other white blood cells. One important aspect of cellular immunity involves an
antigen-specific response by cytolytic T-cells ("CTLs"). CTLs have specificity for
peptide antigens that are presented in association with proteins encoded by the major
histocompatibility complex (MHC) and expressed on the surfaces of cells. CTLs help
induce and promote the intracellular destruction of intracellular microbes, or the lysis
of cells infected with such microbes. Another aspect of cellular immunity involves an
antigen-specific response by helper T-cells. Helper T-cells act to help stimulate the
function, and focus the activity of, nonspecific effector cells against cells displaying
peptide antigens in association with MHC molecules on their surface. A "cellular
immune response" also refers to the production of cytokines, chemokines and other
such molecules produced by activated T-cells and/or other white blood cells, including
those derived from CD4+ and CD8+ T-cells.

A composition or vaccine that elicits a cellular immune response may serve to
sensitize a vertebrate subject by the presentation of antigen in association with MHC
molecules at the cell surface. The cell-mediated immune response is directed at, or
near, cells presenting antigen at their surface. In addition, antigen-specific T-
lymphocytes can be generated to allow for the future protection of an immunized host.

The ability of a particular antigen to stimulate a cell-mediated immunological
response may be determined by a number of assays, such as by lymphoproliferation
(lymphocyte activation) assays, CTL cytotoxic cell assays, or by assaying for T-
lymphocytes specific for the antigen in a sensitized subject. Such assays are well
known in the art. See, e.g., Erickson et al., J. Immunol. (1993) 151:4189-4199; Doe et
al., Eur. J. Immunol. (1994) 24:2369-2376; and the examples below.

Thus, an immunological response as used herein may be one which stimulates
the production of CTLs, and/or the production or activation of helper T-cells. The
antigen of interest may also elicit an antibody-mediated immune response. Hence, an
immunological response may include one or more of the following effects: the
production of antibodies by B-cells; and/or the activation of suppressor T-cells and/or
γδ T-cells directed specifically to an antigen or antigens present in the composition or
vaccine of interest. These responses may serve to neutralize infectivity, and/or mediate
antibody-complement, or antibody dependent cell cytotoxicity (ADCC) to provide
protection or alleviation of symptoms to an immunized host. Such responses can be
determined using standard immunoassays and neutralization assays, well known in the
art.

A "coding sequence" or a sequence which "encodes" a selected polypeptide, is a
nucleic acid molecule which is transcribed (in the case of DNA) and translated (in the
case of mRNA) into a polypeptide in vitro or in vivo when placed under the control of
appropriate regulatory sequences. The boundaries of the coding sequence are
determined by a start codon at the 5' (amino) terminus and a translation stop codon at
the 3' (carboxy) terminus. A transcription termination sequence may be located 3' to
the coding sequence.
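
As a minimal sketch of how the start-codon and stop-codon boundaries of a coding sequence might be located computationally, the following scans the three forward reading frames of a DNA string. The function name, the use of ATG as the only start codon, the standard stop codons, and the restriction to the forward strand are assumptions made for illustration.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_codons=1):
    """Return (start, end) index pairs for ATG...stop stretches.

    Indices are 0-based; end is exclusive and includes the stop codon.
    Only the forward strand is scanned (an assumption of this sketch).
    min_codons counts codons before the stop codon, including the ATG.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # 5' boundary: first start codon in this frame
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))  # 3' boundary: stop codon
                start = None
    return sorted(orfs)
```

For example, `find_orfs("CCATGAAATAGGG")` reports the single stretch `ATGAAATAG` found in the third reading frame.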

A "nucleic acid" molecule or "polynucleotide" can include both double- and
single-stranded sequences and refers to, but is not limited to, cDNA from viral,
procaryotic or eucaryotic mRNA, genomic DNA sequences from viral (e.g. DNA
viruses and retroviruses) or procaryotic DNA, and especially synthetic DNA sequences.
The term also captures sequences that include any of the known base analogs of DNA
and RNA.

"Operably linked" refers to an arrangement of elements wherein the 
components so described are configured so as to perform their desired function. Thus,
a given promoter operably linked to a coding sequence is capable of effecting the
expression of the coding sequence when the proper transcription factors, etc., are
present. The promoter need not be contiguous with the coding sequence, so long as it
functions to direct the expression thereof. Thus, for example, intervening untranslated
yet transcribed sequences can be present between the promoter sequence and the coding
sequence, as can transcribed introns, and the promoter sequence can still be considered
"operably linked" to the coding sequence.

"Recombinant" as used herein to describe a nucleic acid molecule means a 
polynucleotide of genomic, cDNA, viral, semisynthetic, or synthetic origin which, by 
virtue of its origin or manipulation is not associated with all or a portion of the
polynucleotide with which it is associated in nature. The term "recombinant" as used
with respect to a protein or polypeptide means a polypeptide produced by expression of
a recombinant polynucleotide. In general, the gene of interest is cloned and then
expressed in transformed organisms, as described further below. The host organism
expresses the foreign gene to produce the protein under expression conditions.

A "control element" refers to a polynucleotide sequence which aids in the
expression of a coding sequence to which it is linked. The term includes promoters,
transcription termination sequences, upstream regulatory domains, polyadenylation
signals, untranslated regions, including 5'-UTRs and 3'-UTRs and when appropriate,
leader sequences and enhancers, which collectively provide for the transcription and
translation of a coding sequence in a host cell.

A "promoter" as used herein is a DNA regulatory region capable of binding
RNA polymerase in a host cell and initiating transcription of a downstream (3'
direction) coding sequence operably linked thereto. For purposes of the present
invention, a promoter sequence includes the minimum number of bases or elements
necessary to initiate transcription of a gene of interest at levels detectable above
background. Within the promoter sequence is a transcription initiation site, as well as
protein binding domains (consensus sequences) responsible for the binding of RNA
polymerase. Eucaryotic promoters will often, but not always, contain "TATA" boxes
and "CAT" boxes.

A control sequence "directs the transcription" of a coding sequence in a cell
when RNA polymerase will bind the promoter sequence and transcribe the coding
sequence into mRNA, which is then translated into the polypeptide encoded by the
coding sequence.

"Expression cassette" or "expression construct" refers to an assembly which is 
capable of directing the expression of the sequence(s) or gene(s) of interest. The 
expression cassette includes control elements, as described above, such as a promoter
which is operably linked to (so as to direct transcription of) the sequence(s) or gene(s)
of interest, and often includes a polyadenylation sequence as well. Within certain
embodiments of the invention, the expression cassette described herein may be
contained within a plasmid construct. In addition to the components of the expression
cassette, the plasmid construct may also include one or more selectable markers, a
signal which allows the plasmid construct to exist as single-stranded DNA (e.g., a M13
origin of replication), at least one multiple cloning site, and a "mammalian" origin of
replication (e.g., a SV40 or adenovirus origin of replication).

"Transformation," as used herein, refers to the insertion of an exogenous
polynucleotide into a host cell, irrespective of the method used for insertion: for
example, transformation by direct uptake, transfection, infection, and the like. For
particular methods of transfection, see further below. The exogenous polynucleotide
may be maintained as a nonintegrated vector, for example, an episome, or alternatively,
may be integrated into the host genome.

A "host cell" is a cell which has been transformed, or is capable of 
transformation, by an exogenous DNA sequence. 

By "isolated" is meant, when referring to a polypeptide, that the indicated
molecule is separate and discrete from the whole organism with which the molecule is
found in nature or is present in the substantial absence of other biological
macromolecules of the same type. The term "isolated" with respect to a polynucleotide is a
nucleic acid molecule devoid, in whole or part, of sequences normally associated with
it in nature; or a sequence, as it exists in nature, but having heterologous sequences in
association therewith; or a molecule disassociated from the chromosome.

The term "purified" as used herein preferably means at least 75% by weight, 
more preferably at least 85% by weight, more preferably still at least 95% by weight, 
and most preferably at least 98% by weight, of biological macromolecules of the same 
type are present. 

"Homology" refers to the percent identity between two polynucleotide or two
polypeptide moieties. Two DNA, or two polypeptide sequences are "substantially
homologous" to each other when the sequences exhibit at least about 50%, preferably
at least about 75%, more preferably at least about 80%-85%, preferably at least about
90%, and most preferably at least about 95%-98%, or more, sequence identity over a
defined length of the molecules. As used herein, substantially homologous also refers
to sequences showing complete identity to the specified DNA or polypeptide sequence.
The term "substantially homologous" as used herein in reference to ΔNS35 generally
refers to an HCV nucleic or amino acid sequence that is at least 60% identical to the
entire sequence of the polypeptide encoded by ΔNS35 (see FIG. 5), where the sequence
identity is preferably at least 75%, more preferably at least 80%, still more preferably at
least about 85%, especially more than about 90%, most preferably 95% or greater,
particularly 98% or greater. These homologous polypeptides include fragments,
including mutants and allelic variants of the fragments. Identity between the two
sequences is preferably determined by the Smith-Waterman homology search algorithm
as implemented in the MPSRCH program (Oxford Molecular), using an affine gap
search with parameters gap open penalty=12 and gap extension penalty=1. Thus, for
example, the present invention includes an isolate which is 80% identical to a
polypeptide encoded by ΔNS35. In some aspects of the invention, the polypeptide of
the present invention is substantially homologous to the ΔNS35.
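
For orientation, a Smith-Waterman local alignment with an affine gap penalty can be sketched as follows. This is not the MPSRCH implementation: the match and mismatch scores stand in for a real scoring table and are assumptions, as is the convention that a gap of length k costs gap_open + (k - 1) * gap_extend. Only the best local score is returned.

```python
def smith_waterman_score(a, b, match=5, mismatch=-4,
                         gap_open=12, gap_extend=1):
    """Best local alignment score of a and b with affine gap penalties.

    Uses the three-state (Gotoh) recurrence. The match/mismatch values
    are illustrative assumptions, not a published scoring table.
    """
    neg_inf = float("-inf")
    prev_h = [0] * (len(b) + 1)        # best score ending at (i-1, j)
    prev_f = [neg_inf] * (len(b) + 1)  # ending in a gap that consumes a
    best = 0
    for i in range(1, len(a) + 1):
        cur_h = [0] * (len(b) + 1)
        cur_f = [neg_inf] * (len(b) + 1)
        e = neg_inf                    # ending in a gap that consumes b
        for j in range(1, len(b) + 1):
            # Either extend an existing gap or open a new one.
            e = max(e - gap_extend, cur_h[j - 1] - gap_open)
            cur_f[j] = max(prev_f[j] - gap_extend, prev_h[j] - gap_open)
            sub = match if a[i - 1] == b[j - 1] else mismatch
            # Local alignment: scores never drop below zero.
            cur_h[j] = max(0, prev_h[j - 1] + sub, e, cur_f[j])
            best = max(best, cur_h[j])
        prev_h, prev_f = cur_h, cur_f
    return best
```

With a high gap-open penalty such as 12, the alignment tolerates a mismatch more readily than it opens a gap; lowering gap_open shifts that balance, which is why the penalties are stated alongside any reported identity.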

In general, "identity" refers to an exact nucleotide-to-nucleotide or amino acid-
to-amino acid correspondence of two polynucleotides or polypeptide sequences,
respectively. Percent identity can be determined by a direct comparison of the
sequence information between two molecules by aligning the sequences, counting the
exact number of matches between the two aligned sequences, dividing by the length of
the shorter sequence, and multiplying the result by 100. Readily available computer
programs can be used to aid in the analysis, such as ALIGN, Dayhoff, M.O. in Atlas of
Protein Sequence and Structure M.O. Dayhoff ed., 5 Suppl. 3:353-358, National
Biomedical Research Foundation, Washington, DC, which adapts the local homology
algorithm of Smith and Waterman Advances in Appl. Math. 2:482-489, 1981 for
peptide analysis. Programs for determining nucleotide sequence identity are available
in the Wisconsin Sequence Analysis Package, Version 8 (available from Genetics
Computer Group, Madison, WI) for example, the BESTFIT, FASTA and GAP
programs, which also rely on the Smith and Waterman algorithm. These programs are
readily utilized with the default parameters recommended by the manufacturer and
described in the Wisconsin Sequence Analysis Package referred to above. For
example, percent identity of a particular nucleotide sequence to a reference sequence
can be determined using the homology algorithm of Smith and Waterman with a
default scoring table and a gap penalty of six nucleotide positions.
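
The direct-comparison calculation described above (count exact matches between two aligned sequences, divide by the length of the shorter sequence, multiply by 100) can be sketched as follows. The function name and the use of "-" as the gap character are assumptions for illustration; the two inputs are taken to be already aligned and of equal aligned length.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity of two pre-aligned sequences.

    Matches are exact character matches at non-gap positions; the
    denominator is the ungapped length of the shorter sequence.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )
    shorter = min(
        len(aligned_a.replace(gap, "")),
        len(aligned_b.replace(gap, "")),
    )
    if shorter == 0:
        raise ValueError("sequences must contain at least one residue")
    return 100.0 * matches / shorter
```

For the aligned pair `ACGT-A` / `ACCTTA`, four positions match and the shorter sequence has five residues, giving 80% identity.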

Another method of establishing percent identity in the context of the present
invention is to use the MPSRCH package of programs copyrighted by the University of
Edinburgh, developed by John F. Collins and Shane S. Sturrok, and distributed by
IntelliGenetics, Inc. (Mountain View, CA). From this suite of packages the Smith-
Waterman algorithm can be employed where default parameters are used for the
scoring table (for example, gap open penalty of 12, gap extension penalty of one, and a
gap of six). From the data generated the "Match" value reflects "sequence identity."
Other suitable programs for calculating the percent identity or similarity between
sequences are generally known in the art, for example, another alignment program is
BLAST, used with default parameters. For example, BLASTN and BLASTP can be
used using the following default parameters: genetic code = standard; filter = none;
strand = both; cutoff = 60; expect = 10; Matrix = BLOSUM62; Descriptions = 50
sequences; sort by = HIGH SCORE; Databases = non-redundant, GenBank + EMBL +
DDBJ + PDB + GenBank CDS translations + Swiss protein + Spupdate + PIR. Details
of these programs can be found at the following internet address:

http://www.ncbi.nlm.gov/cgi-bin/BLAST.

Alternatively, homology can be determined by hybridization of polynucleotides
under conditions which form stable duplexes between homologous regions, followed
by digestion with single-stranded-specific nuclease(s), and size determination of the
digested fragments. DNA sequences that are substantially homologous can be
identified in a Southern hybridization experiment under, for example, stringent
conditions, as defined for that particular system. Defining appropriate hybridization
conditions is within the skill of the art. See, e.g., Sambrook et al., supra; DNA Cloning,
supra; Nucleic Acid Hybridization, supra.

"Stringency" refers to conditions in a hybridization reaction that favor 

association of very similar sequences over sequences that differ. For example, the
combination of temperature and salt concentration should be chosen that is
approximately 12° to 20°C below the calculated Tm of the hybrid under study. The
temperature and salt conditions can often be determined empirically in preliminary
experiments in which samples of genomic DNA immobilized on filters are hybridized
to the sequence of interest and then washed under conditions of different stringencies.
See Sambrook et al. at page 9.50.

Variables to consider when performing, for example, a Southern blot are (1) the
complexity of the DNA being blotted and (2) the homology between the probe and the
sequences being detected. The total amount of the fragment(s) to be studied can vary a
magnitude of 10, from 0.1 to 1 µg for a plasmid or phage digest to 10^-9 to 10^-8 g for a
single copy gene in a highly complex eukaryotic genome. For lower complexity
polynucleotides, substantially shorter blotting, hybridization, and exposure times, a
smaller amount of starting polynucleotides, and lower specific activity of probes can be
used. For example, a single-copy yeast gene can be detected with an exposure time of
only 1 hour starting with 1 µg of yeast DNA, blotting for two hours, and hybridizing
for 4-8 hours with a probe of 10^8 cpm/µg. For a single-copy mammalian gene a
conservative approach would start with 10 µg of DNA, blot overnight, and hybridize
overnight in the presence of 10% dextran sulfate using a probe of greater than 10^8
cpm/µg, resulting in an exposure time of ~24 hours.

Several factors can affect the melting temperature (Tm) of a DNA-DNA hybrid
between the probe and the fragment of interest, and consequently, the appropriate
conditions for hybridization and washing. In many cases the probe is not 100%
homologous to the fragment. Other commonly encountered variables include the
length and total G+C content of the hybridizing sequences and the ionic strength and
formamide content of the hybridization buffer. The effects of all of these factors can be
approximated by a single equation:

Tm = 81 + 16.6(log10 Ci) + 0.4[%(G + C)] - 0.6(%formamide) - 600/n - 1.5(%mismatch),

where Ci is the salt concentration (monovalent ions) and n is the length of the hybrid in
base pairs (slightly modified from Meinkoth & Wahl (1984) Anal. Biochem. 138:267-
284). In general, convenient hybridization temperatures in the presence of 50%
formamide are 42°C for a probe which is 95% to 100% homologous to the target
fragment, 37°C for 90% to 95% homology, and 32°C for 85% to 90% homology. For
lower homologies, formamide content should be lowered and temperature adjusted
accordingly, using the equation above. If the homology between the probe and the
target fragment is not known, the simplest approach is to start with both hybridization
and wash conditions which are nonstringent. If non-specific bands or high background
are observed after autoradiography, the filter can be washed at high stringency and
reexposed. If the time required for exposure makes this approach impractical, several
hybridization and/or washing stringencies should be tested in parallel.
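
As a worked illustration of the melting-temperature equation given above, the following sketch evaluates it for a chosen set of conditions. The function and parameter names are assumptions; the coefficients are those stated in the text, and the salt concentration is taken to be in mol/L.

```python
import math


def melting_temperature(salt_molar, gc_percent, formamide_percent,
                        length_bp, mismatch_percent=0.0):
    """Approximate Tm (in degrees C) of a DNA-DNA hybrid.

    Coefficients follow the equation in the text (slightly modified
    from Meinkoth & Wahl, 1984). salt_molar is the monovalent ion
    concentration, assumed here to be expressed in mol/L.
    """
    return (
        81.0
        + 16.6 * math.log10(salt_molar)
        + 0.4 * gc_percent
        - 0.6 * formamide_percent
        - 600.0 / length_bp
        - 1.5 * mismatch_percent
    )
```

For a 600 bp hybrid with 50% G+C in 1 M salt and 50% formamide, the terms are 81 + 0 + 20 - 30 - 1, or about 70°C for a perfectly matched probe; each 1% of mismatch lowers the estimate by a further 1.5°C.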

By "nucleic acid immunization" is meant the introduction of a nucleic acid
molecule encoding one or more selected antigens into a host cell, for the in vivo
expression of the antigen or antigens. The nucleic acid molecule can be introduced
directly into the recipient subject, such as by injection, inhalation, oral, intranasal and
mucosal administration, or the like, or can be introduced ex vivo, into cells which have
been removed from the host. In the latter case, the transformed cells are reintroduced
into the subject where an immune response can be mounted against the antigen encoded
by the nucleic acid molecule.

An "open reading frame" or ORF is a region of a polynucleotide sequence
which encodes a polypeptide; this region can represent a portion of a coding sequence
or a total coding sequence.

As used herein, the term "antibody" refers to a polypeptide or group of
polypeptides which comprise at least one antigen binding site. An "antigen binding
site" is formed from the folding of the variable domains of an antibody molecule(s) to
form three-dimensional binding sites with an internal surface shape and charge
distribution complementary to the features of an epitope of an antigen, which allows
specific binding to form an antibody-antigen complex. An antigen binding site may be
formed from a heavy- and/or light-chain domain (VH and VL, respectively), which
form hypervariable loops which contribute to antigen binding. The term "antibody"
includes, without limitation, polyclonal antibodies, monoclonal antibodies, chimeric
antibodies, altered antibodies, univalent antibodies, Fab proteins, and single-domain
antibodies. In many cases, the binding phenomena of antibodies to antigens is
equivalent to other ligand/anti-ligand binding.

If polyclonal antibodies are desired, a selected mammal (e.g., mouse, rabbit,
goat, horse, etc.) is immunized with an immunogenic polypeptide bearing an HCV
epitope(s). Serum from the immunized animal is collected and treated according to
known procedures. If serum containing polyclonal antibodies to an HCV epitope
contains antibodies to other antigens, the polyclonal antibodies can be purified by
immunoaffinity chromatography. Techniques for producing and processing polyclonal
antisera are known in the art, see for example, Mayer and Walker, eds. (1987)
IMMUNOCHEMICAL METHODS IN CELL AND MOLECULAR BIOLOGY
(Academic Press, London).

Monoclonal antibodies directed against HCV epitopes can also be readily
produced by one skilled in the art. The general methodology for making monoclonal
antibodies by hybridomas is well known. Immortal antibody-producing cell lines can
be created by cell fusion, and also by other techniques such as direct transformation of
B lymphocytes with oncogenic DNA, or transfection with Epstein-Barr virus. See, e.g.,
M. Schreier et al. (1980) HYBRIDOMA TECHNIQUES; Hammerling et al. (1981),
MONOCLONAL ANTIBODIES AND T-CELL HYBRIDOMAS; Kennett et al.
(1980) MONOCLONAL ANTIBODIES; see also, U.S. Pat. Nos. 4,341,761; 4,399,121;
4,427,783; 4,444,887; 4,466,917; 4,472,500; 4,491,632; and 4,493,890. Panels of
monoclonal antibodies produced against HCV epitopes can be screened for various
properties; i.e., for isotype, epitope affinity, etc. As used herein, a "single domain
antibody" (dAb) is an antibody which is comprised of a VH domain, which binds
specifically with a designated antigen. A dAb does not contain a VL domain, but may
contain other antigen binding domains known to exist in antibodies, for example, the
kappa and lambda domains. Methods for preparing dAbs are known in the art. See, for
example, Ward et al. Nature 341:544 (1989).

Antibodies can also be comprised of VH and VL domains, as well as other
known antigen binding domains. Examples of these types of antibodies and methods
for their preparation are known in the art (see, e.g., U.S. Pat. No. 4,816,467), and
include the following. For example, "vertebrate antibodies" refers to antibodies which
are tetramers or aggregates thereof, comprising light and heavy chains which are
usually aggregated in a "Y" configuration and which may or may not have covalent
linkages between the chains. In vertebrate antibodies, the amino acid sequences of the
chains are homologous with those sequences found in antibodies produced in
vertebrates, whether in situ or in vitro (for example, in hybridomas). Vertebrate
antibodies include, for example, purified polyclonal antibodies and monoclonal
antibodies, methods for the preparation of which are described infra.

"Hybrid antibodies" are antibodies where chains are separately homologous 
with reference to mammalian antibody chains and represent novel assemblies of them,
so that two different antigens are precipitable by the tetramer or aggregate. In hybrid
antibodies, one pair of heavy and light chains are homologous to those found in an
antibody raised against a first antigen, while a second pair of chains are homologous to
those found in an antibody raised against a second antigen. This results in the property
of "divalence", i.e., the ability to bind two antigens simultaneously. Such hybrids can
also be formed using chimeric chains, as set forth below.

"Chimeric antibodies" refers to antibodies in which the heavy and/or light 
chains are fusion proteins. Typically, one portion of the amino acid sequences of the
chain is homologous to corresponding sequences in an antibody derived from a
particular species or a particular class, while the remaining segment of the chain is
homologous to the sequences derived from another species and/or class. Usually, the
variable region of both light and heavy chains mimics the variable regions of antibodies
derived from one species of vertebrates, while the constant portions are homologous to
the sequences in the antibodies derived from another species of vertebrates. However,
the definition is not limited to this particular example. Also included is any antibody in
which either or both of the heavy or light chains are composed of combinations of
sequences mimicking the sequences in antibodies of different sources, whether these
sources be from differing classes or different species of origin, and whether or not the
fusion point is at the variable/constant boundary. Thus, it is possible to produce
antibodies in which neither the constant nor the variable region mimics known antibody
sequences. It then becomes possible, for example, to construct antibodies whose
variable region has a higher specific affinity for a particular antigen, or whose constant
region can elicit enhanced complement fixation, or to make other improvements in
properties possessed by a particular constant region.

Another example is "altered antibodies", which refers to antibodies in which the
naturally occurring amino acid sequence in a vertebrate antibody has been varied.
Utilizing recombinant DNA techniques, antibodies can be redesigned to obtain desired
characteristics. The possible variations are many, and range from the changing of one
or more amino acids to the complete redesign of a region, for example, the constant
region. Changes in the constant region are made, in general, to attain desired cellular
process characteristics, e.g., changes in complement fixation, interaction with membranes, and
other effector functions. Changes in the variable region can be made to alter antigen
binding characteristics. The antibody can also be engineered to aid the specific delivery
of a molecule or substance to a specific cell or tissue site. The desired alterations can be
made by known techniques in molecular biology, e.g., recombinant techniques, site-
directed mutagenesis, etc.

Yet another example is "univalent antibodies", which are aggregates
comprised of a heavy-chain/light-chain dimer bound to the Fc (i.e., stem) region of a
second heavy chain. This type of antibody escapes antigenic modulation. See, e.g.,
Glennie et al. Nature 295:712 (1982). Included also within the definition of antibodies
are "Fab" fragments of antibodies. The "Fab" region refers to those portions of the
heavy and light chains which are roughly equivalent, or analogous, to the sequences
which comprise the branch portion of the heavy and light chains, and which have been
shown to exhibit immunological binding to a specified antigen, but which lack the
effector Fc portion. "Fab" includes aggregates of one heavy and one light chain
(commonly known as Fab'), as well as tetramers containing the 2H and 2L chains
(referred to as F(ab)2), which are capable of selectively reacting with a designated
antigen or antigen family. Fab antibodies can be divided into subsets analogous to those
described above, i.e., "vertebrate Fab", "hybrid Fab", "chimeric Fab", and "altered Fab".
Methods of producing Fab fragments of antibodies are known within the art and
include, for example, proteolysis, and synthesis by recombinant techniques.

"Antigen-antibody complex" refers to the complex formed by an antibody that
is specifically bound to an epitope on an antigen.

"Immunogenic polypeptide" refers to a polypeptide that elicits a cellular and/or
humoral immune response in a mammal, whether alone or linked to a carrier, in the
presence or absence of an adjuvant.

"Antigenic determinant" refers to the site on an antigen or hapten to which a
specific antibody molecule or specific cell surface receptor binds.

As used herein, "treatment" refers to any of (i) the prevention of infection or
reinfection, as in a traditional vaccine, (ii) the reduction or elimination of symptoms,
and (iii) the substantial or complete elimination of the pathogen in question. Treatment
may be effected prophylactically (prior to infection) or therapeutically (following
infection).

By "vertebrate subject" is meant any member of the subphylum Chordata,
including, without limitation, humans and other primates, including non-human
primates such as chimpanzees and other apes and monkey species; farm animals such
as cattle, sheep, pigs, goats and horses; domestic mammals such as dogs and cats;
laboratory animals including rodents such as mice, rats and guinea pigs; birds,
including domestic, wild and game birds such as chickens, turkeys and other
gallinaceous birds, ducks, geese, and the like. The term does not denote a particular
age. Thus, both adult and newborn individuals are intended to be covered. The
invention described herein is intended for use in any of the above vertebrate species,
since the immune systems of all of these vertebrates operate similarly.

II. Modes of Carrying Out the Invention

Before describing the present invention in detail, it is to be understood that this
invention is not limited to particular formulations or process parameters as such may, of
course, vary. It is also to be understood that the terminology used herein is for the
purpose of describing particular embodiments of the invention only, and is not intended
to be limiting.

Although a number of compositions and methods similar or equivalent to those
described herein can be used in the practice of the present invention, the preferred
materials and methods are described herein.

General Overview 

An aim of an HCV vaccine is to generate broad immunity to a wide breadth of
antigens because HCV is so divergent and because humoral as well as cellular immune
responses are desirable to combat this human pathogen. While antibodies generated
against the envelope glycoprotein(s) might aid in virus neutralization, there is
additional benefit to be derived from a vaccine that includes other regions. The
likelihood of T-helper responses generated against a polypeptide would be helpful in a
vaccine setting as would generation of cytotoxic T cells. The non-structural region
represents such a candidate antigen, but processing by the protease generates several
polypeptides, making purification complicated. It would be advantageous, therefore, to
derive a non-structural cassette that is unprocessed by the NS3 protease.

The present invention solves this and other problems using compositions and
methods involving an N-terminal deletion in NS3, which removes the catalytic domain.
As such, some or all of the remainder of the non-structural region (through NS5B) is
expressed as an intact polypeptide. Expression of this species has been documented in
mammalian cells as well as in yeast. Further, in certain aspects, polynucleotides
encoding HCV core polypeptides (or fragments thereof) are added (e.g., operably
linked) to the carboxy-terminus of the non-structural cassette. As the core coding
region is relatively highly conserved among HCV isolates, the presence of this region
may enhance the immune response. Because core has at its C-terminus a very
hydrophobic domain (amino acids 174-191), shorter versions of core were also
engineered onto the polypeptide. As described in detail herein, the truncation of core to
amino acid 121 yielded higher expression than the amino acid 173 truncation when
engineered onto the C-terminus of the mutant NS polypeptide. The combination of
most of the non-structural region fused to a C-terminally truncated core into a
polypeptide is novel and has advantages for vaccine immunization. Moreover, because
the aim is not necessarily to generate antibody responses to this polypeptide, there is no
need to maintain a native conformation, enabling a more facile purification protocol.


Mutant HCV Non-Structural Polypeptides 

Genomes of HCV strains contain a single open reading frame of approximately
9,000 to 12,000 nucleotides, which is translated into a polyprotein. An HCV
polyprotein is cleaved to produce at least ten distinct products, in the order of
NH2-Core-E1-E2-p7-NS2-NS3-NS4a-NS4b-NS5a-NS5b-COOH. Mutant HCV
polypeptides of the invention contain an N-terminal deletion in NS3, which removes or
disables the catalytic domain. Preferably, the polypeptides also include the remainder
of the non-structural region, although in certain embodiments, the polypeptides may
include less than all of the remaining NS polypeptides, for example mutant NS
polypeptides including any combinations of NS2-NS3-NS4a-NS4b-NS5a-NS5b (e.g.,
NS3-NS5a-NS5b; NS3-NS4a-NS4b; NS3-NS4a-NS4b-NS5a; NS3-NS4b-NS5a-NS5b;
NS3-NS4a-NS5a; NS3-NS4b-NS5a; NS3-NS4b-NS5b; etc.).

The HCV NS3 protein functions as a protease and a helicase and occurs at
approximately amino acid 1027 to amino acid 1657 of the polyprotein (numbered
relative to HCV-1). See Choo et al. (1991) Proc. Natl. Acad. Sci. USA 88:2451-2455.
HCV NS4 occurs at approximately amino acid 1658 to amino acid 1972, NS5a occurs
at approximately amino acid 1973 to amino acid 2420, and HCV NS5b occurs at
approximately amino acid 2421 to amino acid 3011 of the polyprotein (numbered
relative to HCV-1) (Choo et al., 1991).
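
As a compact restatement of the approximate coordinates given above, the following Python sketch stores the quoted HCV-1 positions in a lookup table. The dictionary and function names are hypothetical; the boundaries are the approximate values from the text, not exact cleavage sites.

```python
# Approximate HCV-1 polyprotein coordinates quoted in the text
# (1-based, inclusive). Names are illustrative assumptions.
NS_REGIONS = {
    "NS3": (1027, 1657),
    "NS4": (1658, 1972),
    "NS5a": (1973, 2420),
    "NS5b": (2421, 3011),
}


def region_of(position):
    """Return the region containing a 1-based residue position, or None."""
    for name, (start, end) in NS_REGIONS.items():
        if start <= position <= end:
            return name
    return None


def region_length(name):
    """Number of residues spanned by a region (inclusive coordinates)."""
    start, end = NS_REGIONS[name]
    return end - start + 1


if __name__ == "__main__":
    for name in NS_REGIONS:
        print(name, NS_REGIONS[name], region_length(name))
```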

The mutant polypeptides described herein can either be full-length polypeptides
or portions of NS3, NS4 (NS4a and NS4b), NS5a, and NS5b polypeptides. Epitopes of
NS3, NS4 (NS4a and NS4b), NS5a, NS5b, NS3NS4NS5a, and NS3NS4NS5aNS5b can
be identified by several methods. For example, NS3, NS4, NS5a, NS5b polypeptides
or fusion proteins comprising any combination of the above, can be isolated, for
example, by immunoaffinity purification using a monoclonal antibody for the
polypeptide or protein. The isolated protein sequence can then be screened by
preparing a series of short peptides by proteolytic cleavage of the purified protein,
which together span the entire protein sequence. By starting with, for example,
100-mer polypeptides, each polypeptide can be tested for the presence of epitopes
recognized by a T cell receptor on an HCV-activated T cell. Progressively smaller and
overlapping fragments can then be tested from an identified 100-mer to map the epitope
of interest.
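
The screening strategy above (100-mers first, then progressively smaller overlapping fragments) can be sketched in Python as follows. This is only an illustration of how overlapping fragments might be enumerated in silico; the function name, the step sizes, and the placeholder sequence are assumptions, and the text itself describes preparing peptides by proteolytic cleavage.

```python
def overlapping_fragments(sequence, length, step):
    """Enumerate fragments of `length` residues, starting every `step` residues.

    Returns (start, fragment) tuples with 1-based start positions. A final
    fragment anchored at the C-terminus is added when needed so that the
    whole sequence is covered.
    """
    if length <= 0 or step <= 0:
        raise ValueError("length and step must be positive")
    if len(sequence) <= length:
        return [(1, sequence)]
    fragments = []
    last_start = len(sequence) - length
    for start in range(0, last_start + 1, step):
        fragments.append((start + 1, sequence[start:start + length]))
    if fragments[-1][0] - 1 != last_start:
        fragments.append((last_start + 1, sequence[last_start:]))
    return fragments


if __name__ == "__main__":
    placeholder = "A" * 631  # stand-in for a protein sequence; not real data
    first_pass = overlapping_fragments(placeholder, length=100, step=50)
    # A positive 100-mer could then be re-scanned with shorter fragments:
    start, hit = first_pass[0]
    second_pass = overlapping_fragments(hit, length=20, step=10)
    print(len(first_pass), len(second_pass))
```

Choosing a step smaller than the fragment length guarantees that an epitope shorter than the overlap lies wholly within at least one fragment.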

Epitopes recognized by a T cell receptor on an HCV-activated T cell can be
identified by, for example, 51Cr release assay (see Example 2) or by
lymphoproliferation assay (see Example 4). In a 51Cr release assay, target cells can be
constructed that display the epitope of interest by cloning a polynucleotide encoding the
epitope into an expression vector and transforming the expression vector into the target
cells. Non-structural polypeptides can occur in any order in the fusion protein. If
desired, at least 2, 3, 4, 5, 6, 7, 8, 9, or 10 or more of one or more of the polypeptides
may occur in the fusion protein. Multiple viral strains of HCV occur, and NS3, NS4,
NS5a, and NS5b polypeptides of any of these strains can be used in a fusion protein.

Nucleic acid and amino acid sequences of a number of HCV strains and
isolates, including nucleic acid and amino acid sequences of NS3, NS4, NS5a, NS5b
genes and polypeptides, have been determined. For example, isolate HCV J1.1 is
described in Kubo et al. (1989) Japan. Nucl. Acids Res. 17:10367-10372; Takeuchi et
al. (1990) Gene 91:287-291; Takeuchi et al. (1990) J. Gen. Virol. 71:3027-3033; and
Takeuchi et al. (1990) Nucl. Acids Res. 18:4626. The complete coding sequences of
two independent isolates, HCV-J and BK, are described by Kato et al. (1990) Proc.
Natl. Acad. Sci. USA 87:9524-9528 and Takamizawa et al. (1991) J. Virol.
65:1105-1113, respectively.

Publications that describe HCV-1 isolates include Choo et al. (1990) Brit. Med.
Bull. 46:423-441; Choo et al. (1991) Proc. Natl. Acad. Sci. USA 88:2451-2455 and
Han et al. (1991) Proc. Natl. Acad. Sci. USA 88:1711-1715. HCV isolates HC-J1 and
HC-J4 are described in Okamoto et al. (1991) Japan J. Exp. Med. 60:167-177. HCV
isolates HCT 18, HCT 23, Th, HCT 27, EC1 and EC10 are described in Weiner et al.
(1991) Virol. 180:842-848. HCV isolates Pt-1, HCV-K1 and HCV-K2 are described in
Enomoto et al. (1990) Biochem. Biophys. Res. Commun. 170:1021-1025. HCV
isolates A, C, D & E are described in Tsukiyama-Kohara et al. (1991) Virus Genes
5:243-254.

Each of the mutant HCV polypeptides containing at least portions of NS3, NS4
and NS5 can be obtained from the same HCV strain or isolate or from different HCV
strains or isolates. Thus, each non-structural region of the polypeptide can be from the
same HCV strain or isolate or from different HCV strains or isolates. In addition
to the mutant HCV non-structural polypeptides described herein, the proteins can
contain other polypeptides derived from the HCV polyprotein. For example, it may be
desirable to include polypeptides derived from the core region of the HCV polyprotein.
This region occurs at amino acid positions 1-191 of the HCV polyprotein, numbered
relative to HCV-1. Either the full-length protein or epitopes of the full-length protein
may be used in the subject fusions, such as those epitopes found between amino acids
10-53, amino acids 10-45, amino acids 67-88, amino acids 120-130, or any of the core
epitopes identified in, e.g., Houghton et al., U.S. Patent No. 5,350,671; Chien et al.,
Proc. Natl. Acad. Sci. USA (1992) 89:10011-10015; Chien et al., J. Gastroent. Hepatol.
(1993) 8:S33-39; Chien et al., International Publication No. WO 93/00365; Chien,
D.Y., International Publication No. WO 94/01778; and commonly owned U.S. Patent
No. 6,150,087. When present, additional HCV polypeptides such as core
can be obtained from the same HCV strain or isolate or from different HCV strains or
isolates.

Preferably, the above-described mutant proteins, as well as the individual
components of these proteins, are produced recombinantly. A polynucleotide encoding
these proteins can be introduced into an expression vector which can be expressed in a
suitable expression system. A variety of bacterial, yeast, mammalian, insect and plant
expression systems are available in the art and any such expression system can be used.
Optionally, a polynucleotide encoding these proteins can be translated in a cell-free
translation system. Such methods are well known in the art. The proteins also can be
constructed by solid phase protein synthesis.

If desired, the mutant polypeptides, or the individual components of these
polypeptides, also can contain other amino acid sequences, such as amino acid linkers
or signal sequences, as well as ligands useful in protein purification, such as
glutathione-S-transferase and staphylococcal protein A.


Polynucleotides 

The polynucleotides of the present invention are not necessarily physically
derived from the nucleotide sequences shown, but can be generated in any manner,
including, for example, chemical synthesis or DNA replication or reverse transcription
or transcription. In addition, combinations of regions corresponding to that of the
designated sequences can be modified in ways known to the art to be consistent with an
intended use.

The DNA encoding the desired polypeptide, whether in fused or mature form,
and whether or not containing a signal sequence to permit secretion, can be ligated into
expression vectors suitable for any convenient host. Both eukaryotic and prokaryotic
host systems are presently used in forming recombinant polypeptides, and a summary
of some of the more common control systems and host cells is given below. The
polypeptide produced in such host cells is then isolated from lysed cells or from the
culture medium and purified to the extent needed for its intended use.

Purification can be by techniques known in the art, for example, differential
extraction, salt fractionation, chromatography on ion exchange resins, affinity
chromatography, centrifugation, alkali resolubilization of insoluble protein, and the
like. See, for example, Methods in Enzymology for a variety of methods for purifying
proteins.

Polynucleotides contain less than an entire HCV genome and can be RNA or
single- or double-stranded DNA. Preferably, the polynucleotides are isolated free of
other components, such as proteins and lipids. Polynucleotides of the invention can
also comprise other nucleotide sequences, such as sequences coding for linkers, signal
sequences, or ligands useful in protein purification such as glutathione-S-transferase
and staphylococcal protein A.

Polynucleotides encoding mutant HCV non-structural polypeptides can be
isolated from a genomic library derived from nucleic acid sequences present in, for
example, the plasma, serum, or liver homogenate of an HCV infected individual or can
be synthesized in the laboratory, for example, using an automatic synthesizer. An
amplification method such as PCR can be used to amplify polynucleotides from either
HCV genomic DNA or cDNA.

Further, while the polypeptides that are not NS3, NS4, or NS5 of HCV of the
present invention can comprise a substantially complete viral domain, in many
applications all that is required is that the polypeptide comprise an antigenic or
immunogenic region of the virus. An antigenic region of a polypeptide is generally
relatively small, typically 8 to 10 amino acids or less in length. Fragments of as few as 5
amino acids can characterize an antigenic region. These segments can correspond to
regions of, for example, C, E1, or E2 epitopes. Accordingly, using the cDNAs of C, E1,
or E2 as a basis, DNAs encoding short segments of C, E1, or E2 polypeptides can be
expressed recombinantly either as fusion proteins, or as isolated polypeptides. In
addition, short amino acid sequences can be conveniently obtained by chemical
synthesis.

Polynucleotides encoding the polypeptides described herein can comprise
coding sequences for these polypeptides which occur naturally or can be artificial
sequences which do not occur in nature. These polynucleotides can be ligated to form a
coding sequence for the fusion proteins using standard molecular biology techniques.
If desired, polynucleotides can be cloned into an expression vector and transformed
into, for example, bacterial, yeast, insect, plant or mammalian cells so that the fusion
proteins of the invention can be expressed in and isolated from a cell culture.

The expression of polypeptides containing these domains in a variety of
recombinant host cells, including, for example, bacteria, yeast, insect, plant and
vertebrate cells, gives rise to important immunological reagents which can be used for
diagnosis, detection, and vaccines.

The general techniques used in extracting the genome from a virus, preparing
and probing a cDNA library, sequencing clones, constructing expression vectors,
transforming cells, performing immunological assays such as radioimmunoassays and
ELISA assays, growing cells in culture, and the like are known in the art, and
laboratory manuals are available describing these techniques. However, as a general
guide, the following sets forth some sources currently available for such procedures,
and for materials useful in carrying them out.

Both prokaryotic and eukaryotic host cells may be used for expression of
desired coding sequences when appropriate control sequences which are compatible
with the designated host are used. Among prokaryotic hosts, E. coli is most frequently
used. Expression control sequences for prokaryotes include promoters, optionally
containing operator portions, and ribosome binding sites. Transfer vectors compatible
with prokaryotic hosts are commonly derived from, for example, pBR322, a plasmid
containing operons conferring ampicillin and tetracycline resistance, and the various
pUC vectors, which also contain sequences conferring antibiotic resistance markers.
These markers may be used to obtain successful transformants by selection. Commonly
used prokaryotic control sequences include the beta-lactamase (penicillinase) and
lactose promoter systems (Chang et al. (1977), Nature 198:1056), the tryptophan (trp)
promoter system (Goeddel et al. (1980) Nucleic Acids Res. 8:4057), the lambda-derived
PL promoter and N gene ribosome binding site (Shimatake et al. (1981) Nature
292:128) and the hybrid tac promoter (De Boer et al. (1983) Proc. Natl. Acad. Sci.
U.S.A. 292:128) derived from sequences of the trp and lac UV5 promoters. The
foregoing systems are particularly compatible with E. coli; if desired, other prokaryotic
hosts such as strains of Bacillus or Pseudomonas may be used, with corresponding
control sequences.

Eukaryotic hosts include mammalian and yeast cells in culture systems.
Mammalian cell lines available as hosts for expression are known in the art and include
many immortalized cell lines available from the American Type Culture Collection
(ATCC), including HeLa cells, Chinese hamster ovary (CHO) cells, baby hamster
kidney (BHK) cells, and a number of other cell lines. Suitable promoters for
mammalian cells are also known in the art and include viral promoters such as that
from Simian Virus 40 (SV40) (Fiers (1978), Nature 273:113), Rous sarcoma virus
(RSV), adenovirus (ADV), and bovine papilloma virus (BPV). Mammalian cells may
also require terminator sequences and poly A addition sequences; enhancer sequences
which increase expression may also be included, and sequences which cause
amplification of the gene may also be desirable. These sequences are known in the art.
Vectors suitable for replication in mammalian cells may include viral replicons, or
sequences which ensure integration of the appropriate sequences encoding NANBV
epitopes into the host genome.

The vaccinia virus system can also be used to express foreign DNA in
mammalian cells. To express heterologous genes, the foreign DNA is usually inserted
into the thymidine kinase gene of the vaccinia virus and then infected cells can be
selected. This procedure is known in the art and further information can be found in
these references (Mackett et al., J. Virol. 49:857-864 (1984) and Chapter 7 in DNA
Cloning, Vol. 2, IRL Press).

Yeast expression systems are also known to one of ordinary skill in the art. A
yeast promoter is any DNA sequence capable of binding yeast RNA polymerase and
initiating the downstream (3') transcription of a coding sequence (e.g., structural gene)
into mRNA. A promoter will have a transcription initiation region which is usually
placed proximal to the 5' end of the coding sequence. This transcription initiation
region usually includes an RNA polymerase binding site (the "TATA Box") and a
transcription initiation site. A yeast promoter may also have a second domain called an
upstream activator sequence (UAS), which, if present, is usually distal to the structural
gene. The UAS permits regulated (inducible) expression. Constitutive expression
occurs in the absence of a UAS. Regulated expression may be either positive or
negative, thereby either enhancing or reducing transcription.

Yeast is a fermenting organism with an active metabolic pathway; therefore,
sequences encoding enzymes in the metabolic pathway provide particularly useful
promoter sequences. Examples include alcohol dehydrogenase (ADH) (EP-A-0 284 044),
enolase, glucokinase, glucose-6-phosphate isomerase,
glyceraldehyde-3-phosphate-dehydrogenase (GAP or GAPDH), hexokinase,
phosphofructokinase, 3-phosphoglycerate mutase, and pyruvate kinase (PyK)
(EPO-A-0 329 203). The yeast PHO5 gene, encoding acid phosphatase, also provides
useful promoter sequences (Myanohara et al. (1983) Proc. Natl. Acad. Sci. USA 80:1).

In addition, synthetic promoters which do not occur in nature also function as
yeast promoters. For example, UAS sequences of one yeast promoter may be joined
with the transcription activation region of another yeast promoter, creating a synthetic
hybrid promoter. Examples of such hybrid promoters include the ADH regulatory
sequence linked to the GAP transcription activation region (US Patent Nos. 4,876,197
and 4,880,734). Other examples of hybrid promoters include promoters which consist
of the regulatory sequences of either the ADH2, GAL4, GAL10, or PHO5 genes,
combined with the transcriptional activation region of a glycolytic enzyme gene such as
GAP or PyK (EP-A-0 164 556). Furthermore, a yeast promoter can include naturally
occurring promoters of non-yeast origin that have the ability to bind yeast RNA
polymerase and initiate transcription. Examples of such promoters include, inter alia,
(Cohen et al. (1980) Proc. Natl. Acad. Sci. USA 77:1078; Henikoff et al. (1981)
Nature 283:835; Hollenberg et al. (1981) Curr. Topics Microbiol. Immunol. 96:119;
Hollenberg et al. (1979) "The Expression of Bacterial Antibiotic Resistance Genes in
the Yeast Saccharomyces cerevisiae," in: Plasmids of Medical, Environmental and
Commercial Importance (eds. K.N. Timmis and A. Puhler); Mercerau-Puigalon et al.
(1980) Gene 11:163; Panthier et al. (1980) Curr. Genet. 2:109).

A DNA molecule may be expressed intracellularly in yeast. A promoter
sequence may be directly linked with the DNA molecule, in which case the first amino
acid at the N-terminus of the recombinant protein will always be a methionine, which is
encoded by the ATG start codon. If desired, methionine at the N-terminus may be
cleaved from the protein by in vitro incubation with cyanogen bromide.

Fusion proteins provide an alternative for yeast expression systems, as well as
in mammalian, baculovirus, and bacterial expression systems. Usually, a DNA
sequence encoding the N-terminal portion of an endogenous yeast protein, or other
stable protein, is fused to the 5' end of heterologous coding sequences. Upon
expression, this construct will provide a fusion of the two amino acid sequences. For
example, the yeast or human superoxide dismutase (SOD) gene can be linked at the 5'
terminus of a foreign gene and expressed in yeast. The DNA sequence at the junction
of the two amino acid sequences may or may not encode a cleavable site. See e.g.,
EP-A-0 196 056. Another example is a ubiquitin fusion protein. Such a fusion protein is
made with the ubiquitin region that preferably retains a site for a processing enzyme
(e.g., ubiquitin-specific processing protease) to cleave the ubiquitin from the foreign
protein. Through this method, therefore, native foreign protein can be isolated (e.g.,
WO88/024066).

Alternatively, foreign proteins can also be secreted from the cell into the growth
media by creating chimeric DNA molecules that encode a fusion protein comprised of a
leader sequence fragment that provides for secretion in yeast of the foreign protein.
Preferably, there are processing sites encoded between the leader fragment and the
foreign gene that can be cleaved either in vivo or in vitro. The leader sequence
fragment usually encodes a signal peptide comprised of hydrophobic amino acids
which direct the secretion of the protein from the cell.

DNA encoding suitable signal sequences can be derived from genes for secreted
yeast proteins, such as the yeast invertase gene (EP-A-0 012 873; JPO. 62,096,086)
and the A-factor gene (US patent 4,588,684). Alternatively, leaders of non-yeast
origin, such as an interferon leader, exist that also provide for secretion in yeast
(EP-A-0 060 057).

A preferred class of secretion leaders are those that employ a fragment of the
yeast alpha-factor gene, which contains both a "pre" signal sequence, and a "pro"
region. The types of alpha-factor fragments that can be employed include the
full-length pre-pro alpha factor leader (about 83 amino acid residues) as well as truncated
alpha-factor leaders (usually about 25 to about 50 amino acid residues) (US Patents
4,546,083 and 4,870,008; EP-A-0 324 274). Additional leaders employing an
alpha-factor leader fragment that provides for secretion include hybrid alpha-factor leaders
made with a presequence of a first yeast, but a pro-region from a second yeast
alpha-factor. (See, e.g., WO 89/02463.)

Usually, transcription termination sequences recognized by yeast are regulatory
regions located 3' to the translation stop codon, and thus together with the promoter
flank the coding sequence. These sequences direct the transcription of an mRNA
which can be translated into the polypeptide encoded by the DNA. Examples of
transcription terminator sequences and other yeast-recognized termination sequences
include those associated with genes coding for glycolytic enzymes.

Usually, the above described components, comprising a promoter, leader (if
desired), coding sequence of interest, and transcription termination sequence, are put
together into expression constructs. Expression constructs are often maintained in a
replicon, such as an extrachromosomal element (e.g., plasmids) capable of stable
maintenance in a host, such as yeast or bacteria. The replicon may have two replication
systems, thus allowing it to be maintained, for example, in yeast for expression and in a
prokaryotic host for cloning and amplification. Examples of such yeast-bacteria shuttle
vectors include YEp24 (Botstein et al. (1979) Gene 8:17-24), pCl/1 (Brake et al.
(1984) Proc. Natl. Acad. Sci. USA 81:4642-4646), and YRp17 (Stinchcomb et al.
(1982) J. Mol. Biol. 158:157). In addition, a replicon may be either a high or low
copy number plasmid. A high copy number plasmid will generally have a copy number
ranging from about 5 to about 200, and usually about 10 to about 150. A host
containing a high copy number plasmid will preferably have at least about 10, and more
preferably at least about 20. Either a high or low copy number vector may be selected,
depending upon the effect of the vector and the foreign protein on the host. See, e.g.,
Brake et al., supra.
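
The order of components in an expression construct, as described at the start of the paragraph above (promoter, optional leader, coding sequence, transcription terminator), can be modeled with a small Python data structure. This is a hypothetical bookkeeping sketch; the class name, field names, and the example element labels are assumptions chosen for illustration and do not describe any particular vector.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class ExpressionConstruct:
    """Ordered elements of an expression construct (5' to 3')."""
    promoter: str
    coding_sequence: str
    terminator: str
    leader: Optional[str] = None  # secretion leader is optional

    def elements(self) -> List[Tuple[str, str]]:
        """Return (kind, label) pairs in 5'-to-3' order."""
        parts = [("promoter", self.promoter)]
        if self.leader is not None:
            parts.append(("leader", self.leader))
        parts.append(("coding sequence", self.coding_sequence))
        parts.append(("terminator", self.terminator))
        return parts

    def describe(self) -> str:
        return " - ".join(f"{kind}: {label}" for kind, label in self.elements())


if __name__ == "__main__":
    construct = ExpressionConstruct(
        promoter="hybrid ADH2/GAP promoter",       # example label only
        leader="alpha-factor leader",              # example label only
        coding_sequence="mutant NS cassette",      # example label only
        terminator="yeast transcription terminator",
    )
    print(construct.describe())
```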

Alternatively, the expression constructs can be integrated into the yeast genome
with an integrating vector. Integrating vectors usually contain at least one sequence
homologous to a yeast chromosome that allows the vector to integrate, and preferably
contain two homologous sequences flanking the expression construct. Integrations
appear to result from recombinations between homologous DNA in the vector and the
yeast chromosome (Orr-Weaver et al. (1983) Methods in Enzymol. 101:228-245). An
integrating vector may be directed to a specific locus in yeast by selecting the
appropriate homologous sequence for inclusion in the vector. See Orr-Weaver et al.,
supra. One or more expression constructs may integrate, possibly affecting levels of
recombinant protein produced (Rine et al. (1983) Proc. Natl. Acad. Sci. USA
80:6750). The chromosomal sequences included in the vector can occur either as a
single segment in the vector, which results in the integration of the entire vector, or two
segments homologous to adjacent segments in the chromosome and flanking the
expression construct in the vector, which can result in the stable integration of only the
expression construct.

Usually, extrachromosomal and integrating expression constructs may contain
selectable markers to allow for the selection of yeast strains that have been transformed.
Selectable markers may include biosynthetic genes that can be expressed in the yeast
host, such as ADE2, HIS4, LEU2, and TRP1, as well as ALG7 and the G418 resistance
gene, which confer resistance in yeast cells to tunicamycin and G418, respectively. In
addition, a suitable selectable marker may also provide yeast with the ability to grow in
the presence of toxic compounds, such as metals. For example, the presence of CUP1
allows yeast to grow in the presence of copper ions (Butt et al. (1987) Microbiol. Rev.
51:351).

Alternatively, some of the above described components can be put together into
transformation vectors. Transformation vectors are usually comprised of a selectable
marker that is either maintained in a replicon or developed into an integrating vector, as
described above.

Expression and transformation vectors, either extrachromosomal replicons or
integrating vectors, have been developed for transformation into many yeasts. For
example, expression vectors have been developed for, inter alia, the following yeasts:
Candida albicans (Kurtz et al. (1986) Mol. Cell. Biol. 6:142), Candida maltosa
(Kunze et al. (1985) J. Basic Microbiol. 25:141), Hansenula polymorpha (Gleeson
et al. (1986) J. Gen. Microbiol. 132:3459; Roggenkamp et al. (1986) Mol. Gen.
Genet. 202:302), Kluyveromyces fragilis (Das et al. (1984) J. Bacteriol. 158:1165),
Kluyveromyces lactis (De Louvencourt et al. (1983) J. Bacteriol. 154:737; Van den
Berg et al. (1990) Bio/Technology 8:135), Pichia guillerimondii (Kunze et al. (1985)
J. Basic Microbiol. 25:141), Pichia pastoris (Cregg et al. (1985) Mol. Cell. Biol.
5:3376; US Patent Nos. 4,837,148 and 4,929,555), Saccharomyces cerevisiae (Hinnen
et al. (1978) Proc. Natl. Acad. Sci. USA 75:1929; Ito et al. (1983) J. Bacteriol.
153:163), Schizosaccharomyces pombe (Beach and Nurse (1981) Nature 300:706), and
Yarrowia lipolytica (Davidow et al. (1985) Curr. Genet. 10:380471; Gaillardin et al.
(1985) Curr. Genet. 10:49).

Methods of introducing exogenous DNA into yeast hosts are well-known in the 
art, and usually include either the transformation of spheroplasts or of intact yeast cells 
treated with alkali cations. Transformation procedures usually vary with the yeast 

25 species to be transformed. (See e.g., Kurtz et al (1986) Mo/, Cell Biol 6:142; 
Kunzee/a/. (1985)7. Basic Microbiol 25:141; Candida; Gleeson era/. (1986) J. 
Gen. Microbiol 752:3459; Roggenkamp eM/. (1986) Mo/. Gen, Genet. 202:302; 
Hansenula; Das e/ a/. (1984)7. Bacteriol 755:1165; DeLouvencourt era/. (1983)7. 
Bacteriol 754:1165; Van den Berg a/. (1990) Bio/Technology 8:135; 

30 Kluyveromyces;Cregge/a/. (1985) Mo/. Cell Biol 5:3376; Kunze a/. (1985)7 
Basic Microbiol 25:141; US Patent Nos. 4,837,148 and 4,929,555; Pichia; Hinnen et 
al (1978) Proc. Natl Acad. ScL USA 75;l929;lto etal (1983)7 Bacteriol 
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; 55:163 Saccharomyces; Beach and Nurse (1981) Nature 300:706; 
Schizosaccharomyces; Davidow et al (1985) Curr. Genet. 70:39; Gaillardin et aL 
(1985) Cwrr. Genet. 70:49; Yarrowia). 

Bacterial expression techniques are known in the art. A bacterial promoter is
any DNA sequence capable of binding bacterial RNA polymerase and initiating the
downstream (3') transcription of a coding sequence (e.g., structural gene) into mRNA.
A promoter will have a transcription initiation region which is usually placed proximal
to the 5' end of the coding sequence. This transcription initiation region usually
includes an RNA polymerase binding site and a transcription initiation site. A bacterial
promoter may also have a second domain called an operator, which may overlap an
adjacent RNA polymerase binding site at which RNA synthesis begins. The operator
permits negatively regulated (inducible) transcription, as a gene repressor protein may
bind the operator and thereby inhibit transcription of a specific gene. Constitutive
expression may occur in the absence of negative regulatory elements, such as the
operator. In addition, positive regulation may be achieved by a gene activator protein
binding sequence, which, if present, is usually proximal (5') to the RNA polymerase
binding sequence. An example of a gene activator protein is the catabolite activator
protein (CAP), which helps initiate transcription of the lac operon in Escherichia coli
(E. coli) (Raibaud et al. (1984) Annu. Rev. Genet. 18:173). Regulated expression
may therefore be either positive or negative, thereby either enhancing or reducing
transcription.

Expression and transformation vectors, either extra-chromosomal replicons or
integrating vectors, have been developed for transformation into many bacteria. For
example, expression vectors have been developed for, inter alia, the following bacteria:
Bacillus subtilis (Palva et al. (1982) Proc. Natl. Acad. Sci. USA 79:5582; EP-A-0
036 259 and EP-A-0 063 953; WO 84/04541), Escherichia coli (Shimatake et al.
(1981) Nature 292:128; Amann et al. (1985) Gene 40:183; Studier et al. (1986) J.
Mol. Biol. 189:113; EP-A-0 036 776, EP-A-0 136 829 and EP-A-0 136 907),
Streptococcus cremoris (Powell et al. (1988) Appl. Environ. Microbiol. 54:655),
Streptococcus lividans (Powell et al. (1988) Appl. Environ. Microbiol. 54:655), and
Streptomyces lividans (US patent 4,745,056).

Methods of introducing exogenous DNA into bacterial hosts are well-known in
the art, and usually include either the transformation of bacteria treated with CaCl2 or
other agents, such as divalent cations and DMSO. DNA can also be introduced into
bacterial cells by electroporation. Transformation procedures usually vary with the
bacterial species to be transformed. (See e.g., Masson et al. (1989) FEMS Microbiol.
Lett. 60:273; Palva et al. (1982) Proc. Natl. Acad. Sci. USA 79:5582; EP-A-0 036
259 and EP-A-0 063 953; WO 84/04541, Bacillus; Miller et al. (1988) Proc. Natl.
Acad. Sci. 85:856; Wang et al. (1990) J. Bacteriol. 172:949, Campylobacter; Cohen
et al. (1973) Proc. Natl. Acad. Sci. 69:2110; Dower et al. (1988) Nucleic Acids Res.
16:6127; Kushner (1978) "An improved method for transformation of Escherichia coli
with ColE1-derived plasmids," in Genetic Engineering: Proceedings of the
International Symposium on Genetic Engineering (eds. H.W. Boyer and S. Nicosia);
Mandel et al. (1970) J. Mol. Biol. 53:159; Taketo (1988) Biochim. Biophys. Acta
949:318, Escherichia; Chassy et al. (1987) FEMS Microbiol. Lett. 44:173,
Lactobacillus; Fiedler et al. (1988) Anal. Biochem. 170:38, Pseudomonas; Augustin et
al. (1990) FEMS Microbiol. Lett. 66:203, Staphylococcus; Barany et al. (1980) J.
Bacteriol. 144:698; Harlander (1987) "Transformation of Streptococcus lactis by
electroporation," in: Streptococcal Genetics (ed. J. Ferretti and R. Curtiss III); Perry et
al. (1981) Infect. Immun. 32:1295; Powell et al. (1988) Appl. Environ. Microbiol.
54:655; Somkuti et al. (1987) Proc. 4th Evr. Cong. Biotechnology 1:412,
Streptococcus).

In addition, viral antigens can be expressed in insect cells by the Baculovirus
system. A general guide to Baculovirus expression by Summers and Smith is A Manual
of Methods for Baculovirus Vectors and Insect Cell Culture Procedures (Texas
Agricultural Experiment Station Bulletin No. 1555). To incorporate the heterologous
gene into the Baculovirus genome, the gene is first cloned into a transfer vector
containing some Baculovirus sequences. This transfer vector, when it is cotransfected
with wild-type virus into insect cells, will recombine with the wild-type virus. Usually,
the transfer vector will be engineered so that the heterologous gene will disrupt the
wild-type Baculovirus polyhedrin gene. This disruption enables easy selection of the
recombinant virus since the cells infected with the recombinant virus will appear
phenotypically different from the cells infected with the wild-type virus. The purified
recombinant virus can be used to infect cells to express the heterologous gene. The
foreign protein can be secreted into the medium if a signal peptide is linked in frame to
the heterologous gene; otherwise, the protein will be bound in the cell lysates. For
further information, see Smith et al., Mol. & Cell. Biol. 3:2156-2165 (1983) or Luckow
and Summers in Virology 17:31-39 (1989).

Baculovirus expression can also be effected in plant cells. There are many plant
cell culture and whole plant genetic expression systems known in the art. Exemplary
plant cellular genetic expression systems include those described in patents, such as:
US 5,693,506; US 5,659,122; and US 5,608,143. Additional examples of genetic
expression in plant cell culture have been described by Zenk, Phytochemistry 30:3861-
3863 (1991). Descriptions of plant protein signal peptides may be found, in addition to
the references described above, in Vaulcombe et al., Mol. Gen. Genet. 209:33-40
(1987); Chandler et al., Plant Molecular Biology 3:407-418 [...] J. Biol.
Chem. 260:3731-3738 (1985); Rothstein et al., Gene 55:353-356 (1987); Whittier et
al., Nucleic Acids Research 15:2515-2535 (1987); Wirsel et al., Molecular
Microbiology 3:3-14 (1989); Yu et al., Gene 122:247-253 (1992). A description of the
regulation of plant gene expression by the phytohormone gibberellic acid, and of secreted
enzymes induced by gibberellic acid, can be found in R.L. Jones and J. MacMillin,
Gibberellins: in: Advanced Plant Physiology, Malcolm B. Wilkins, ed., 1984 Pitman
Publishing Limited, London, pp. 21-52. References that describe other metabolically-
regulated genes: Sheen, Plant Cell 2:1027-1038 (1990); Maas et al., EMBO J. 9:3447-
3452 (1990); Benkel and Hickey, Proc. Natl. Acad. Sci. 84:1337-1339 (1987).

All plants from which protoplasts can be isolated and cultured to give whole
regenerated plants can be transformed by the present invention so that whole plants are
recovered which contain the transferred gene. It is known that practically all plants can
be regenerated from cultured cells or tissues, including but not limited to all major
species of sugarcane, sugar beet, cotton, fruit and other trees, legumes and vegetables.
Some suitable plants include, for example, species from the genera Fragaria, Lotus,
Medicago, Onobrychis, Trifolium, Trigonella, Vigna, Citrus, Linum, Geranium,
Manihot, Daucus, Arabidopsis, Brassica, Raphanus, Sinapis, Atropa, Capsicum,
Datura, Hyoscyamus, Lycopersicon, Nicotiana, Solanum, Petunia, Digitalis, Majorana,
Cichorium, Helianthus, Lactuca, Bromus, Asparagus, Antirrhinum, Hemerocallis,
Nemesia, Pelargonium, Panicum, Pennisetum, Ranunculus, Senecio, Salpiglossis,
Cucumis, Browallia, Glycine, Lolium, Zea, Triticum, Sorghum, and Datura.

Transformation can be by any method for introducing polynucleotides into a
host cell, including, for example, packaging the polynucleotide in a virus and
transducing a host cell with the virus, and direct uptake of the polynucleotide. The
transformation procedure used depends upon the host to be transformed. Bacterial
transformation by direct uptake generally employs treatment with calcium or rubidium
chloride (Cohen (1972), Proc. Natl. Acad. Sci. U.S.A. 69:2110; Maniatis et al. (1982),
MOLECULAR CLONING: A LABORATORY MANUAL (Cold Spring Harbor Press,
Cold Spring Harbor, N.Y.)). Yeast transformation by direct uptake may be carried out
using the method of Hinnen et al. (1978) Proc. Natl. Acad. Sci. U.S.A. 75:1929.
Mammalian transformations by direct uptake may be conducted using the calcium
phosphate precipitation method of Graham and Van der Eb (1978), Virology 52:546 or
the various known modifications thereof.

Vector construction employs techniques which are known in the art. Site-
specific DNA cleavage is performed by treating with suitable restriction enzymes under
conditions which generally are specified by the manufacturer of these commercially
available enzymes. The cleaved fragments may be separated using polyacrylamide or
agarose gel electrophoresis techniques, according to the general procedures found in
Methods in Enzymology (1980) 65:499-560. Sticky ended cleavage fragments may be
blunt ended using E. coli DNA polymerase I (Klenow) in the presence of the
appropriate deoxynucleotide triphosphates (dNTPs). Treatment
with S1 nuclease may also be used, resulting in the hydrolysis of any single stranded
DNA portions.
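
As a rough illustration of site-specific cleavage, the sketch below locates the recognition sites of a restriction enzyme in a linear DNA sequence and reports the fragment lengths expected from a complete digest. The two-enzyme table (EcoRI and BamHI, each cutting one base into its six-base site), the function names and the test sequence are illustrative assumptions and are not taken from the specification.

```python
# Hypothetical helper for illustration only; the enzyme table and names are assumptions.
# Each entry maps an enzyme to (recognition site, top-strand cut offset within the site).
ENZYMES = {
    "EcoRI": ("GAATTC", 1),  # G^AATTC
    "BamHI": ("GGATCC", 1),  # G^GATCC
}


def cut_positions(seq, enzyme):
    """Return the 0-based top-strand cut positions for one enzyme in a linear sequence."""
    site, offset = ENZYMES[enzyme]
    seq = seq.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start + offset)
        start = seq.find(site, start + 1)
    return positions


def fragment_lengths(seq, enzyme):
    """Return the fragment lengths produced by a complete digest of a linear sequence."""
    bounds = [0] + cut_positions(seq, enzyme) + [len(seq)]
    return [end - begin for begin, end in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    example = "AAGAATTCTTTTGGATCCAAGAATTCAA"  # made-up sequence
    print(fragment_lengths(example, "EcoRI"))
    print(fragment_lengths(example, "BamHI"))
```

The predicted lengths are what one would compare against the bands resolved by polyacrylamide or agarose gel electrophoresis.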

Ligations are carried out using standard buffer and temperature conditions using
T4 DNA ligase and ATP; sticky end ligations require less ATP and less ligase than
blunt end ligations. When vector fragments are used as part of a ligation mixture, the
vector fragment is often treated with bacterial alkaline phosphatase (BAP) or calf
intestinal alkaline phosphatase to remove the 5'-phosphate and thus prevent religation
of the vector; alternatively, restriction enzyme digestion of unwanted fragments can be
used to prevent ligation. Ligation mixtures are transformed into suitable cloning hosts,
such as E. coli, and successful transformants are selected by, for example, antibiotic
resistance, and screened for the correct construction.

Synthetic oligonucleotides may be prepared using an automated oligonucleotide
synthesizer as described by Warner (1984), DNA 3:401. If desired, the synthetic strands
may be labeled with 32P by treatment with polynucleotide kinase in the presence of 32P-
ATP, using standard conditions for the reaction. DNA sequences, including those
isolated from cDNA libraries, may be modified by known techniques, including, for
example, site directed mutagenesis, as described by Zoller (1982), Nucleic Acids Res.
10:6487.

The expression constructs of the present invention, including the desired fusion,
or individual expression constructs comprising the individual components of these
fusions, may be used for nucleic acid immunization, to activate HCV-specific T cells,
using standard gene delivery protocols. Methods for gene delivery are known in the
art. See, e.g., U.S. Patent Nos. 5,399,346, 5,580,859, 5,589,466. Genes can be
delivered either directly to the vertebrate subject or, alternatively, delivered ex vivo, to
cells derived from the subject and the cells reimplanted in the subject. For example, the
constructs can be delivered as plasmid DNA, e.g., contained within a plasmid, such as
pBR322, pUC, or ColE1.

Additionally, the expression constructs can be packaged in liposomes prior to
delivery to the cells. Lipid encapsulation is generally accomplished using liposomes
which are able to stably bind or entrap and retain nucleic acid. The ratio of condensed
DNA to lipid preparation can vary but will generally be around 1:1 (mg
DNA:micromoles lipid), or more of lipid. For a review of the use of liposomes as
carriers for delivery of nucleic acids, see Hug and Sleight, Biochim. Biophys. Acta
(1991) 1097:1-17; Straubinger et al., in Methods of Enzymology (1983), Vol. 101, pp.
512-527.
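
The DNA-to-lipid ratio quoted above (about 1 mg DNA per micromole of lipid, or more lipid) reduces to simple arithmetic. The following is a minimal sketch; the function names are hypothetical, and the lipid molecular weight is left as a caller-supplied parameter because the specification does not give one.

```python
# Illustrative arithmetic only; function names and the example values are assumptions.

def lipid_micromoles(dna_mg, micromoles_per_mg_dna=1.0):
    """Micromoles of lipid for a given mass of DNA.

    The default of 1.0 reflects the roughly 1:1 (mg DNA : micromoles lipid)
    ratio in the text; larger values correspond to "or more of lipid".
    """
    if dna_mg < 0:
        raise ValueError("DNA mass must be non-negative")
    if micromoles_per_mg_dna < 1.0:
        raise ValueError("the text describes a ratio of about 1:1 or more of lipid")
    return dna_mg * micromoles_per_mg_dna


def lipid_mass_mg(micromoles, mol_weight_g_per_mol):
    """Convert micromoles of lipid to milligrams (micromol x g/mol = micrograms)."""
    return micromoles * mol_weight_g_per_mol / 1000.0


if __name__ == "__main__":
    amount = lipid_micromoles(0.5)       # 0.5 mg DNA at 1:1
    print(amount, "micromoles of lipid")
    print(lipid_mass_mg(amount, 700.0), "mg lipid at a made-up 700 g/mol")
```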

Liposomal preparations for use with the present invention include cationic
(positively charged), anionic (negatively charged) and neutral preparations, with
cationic liposomes particularly preferred. Cationic liposomes are readily available. For
example, N-[1-(2,3-dioleyloxy)propyl]-N,N,N-triethylammonium (DOTMA) liposomes
are available under the trademark Lipofectin, from GIBCO BRL, Grand Island, NY.
(See, also, Felgner et al., Proc. Natl. Acad. Sci. USA (1987) 84:7413-7416). Other
commercially available lipids include transfectace (DDAB/DOPE) and DOTAP/DOPE
(Boehringer). Other cationic liposomes can be prepared from readily available
materials using techniques well known in the art. See, e.g., Szoka et al., Proc. Natl.
Acad. Sci. USA (1978) 75:4194-4198; PCT Publication No. WO 90/11092 for a
description of the synthesis of DOTAP (1,2-bis(oleoyloxy)-3-
(trimethylammonio)propane) liposomes. The various liposome-nucleic acid complexes
are prepared using methods known in the art. See, e.g., Straubinger et al., in
METHODS OF IMMUNOLOGY (1983), Vol. 101, pp. 512-527; Szoka et al., Proc.
Natl. Acad. Sci. USA (1978) 75:4194-4198; Papahadjopoulos et al., Biochim. Biophys.
Acta (1975) 394:483; Wilson et al., Cell (1979) 12:77; Deamer and Bangham,
Biochim. Biophys. Acta (1976) 443:629; Ostro et al., Biochem. Biophys. Res. Commun.
(1977) 76:836; Fraley et al., Proc. Natl. Acad. Sci. USA (1979) 76:3348; Enoch and
Strittmatter, Proc. Natl. Acad. Sci. USA (1979) 76:145; Fraley et al., J. Biol. Chem.
(1980) 255:10431; Szoka and Papahadjopoulos, Proc. Natl. Acad. Sci. USA (1978)
75:145; and Schaefer-Ridder et al., Science (1982) 215:166.

The DNA can also be delivered in cochleate lipid compositions similar to those
described by Papahadjopoulos et al., Biochim. Biophys. Acta (1975) 394:483-491.
See, also, U.S. Patent Nos. 4,663,161 and 4,871,488.

A number of viral based systems have been developed for gene transfer into
mammalian cells. For example, retroviruses provide a convenient platform for gene
delivery systems, such as murine sarcoma virus, mouse mammary tumor virus,
Moloney murine leukemia virus, and Rous sarcoma virus. A selected gene can be
inserted into a vector and packaged in retroviral particles using techniques known in the
art. The recombinant virus can then be isolated and delivered to cells of the subject
either in vivo or ex vivo. A number of retroviral systems have been described (U.S.
Patent No. 5,219,740; Miller and Rosman, BioTechniques (1989) 7:980-990; Miller,
A.D., Human Gene Therapy (1990) 1:5-14; Scarpa et al., Virology (1991) 180:849-852;
Burns et al., Proc. Natl. Acad. Sci. USA (1993) 90:8033-8037; and Boris-Lawrie and
Temin, Cur. Opin. Genet. Develop. (1993) 3:102-109). Briefly, retroviral gene delivery
vehicles of the present invention may be readily constructed from a wide variety of
retroviruses, including for example, B, C, and D type retroviruses as well as
spumaviruses and lentiviruses such as FIV, HIV, HIV-1, HIV-2 and SIV (see RNA
Tumor Viruses, Second Edition, Cold Spring Harbor Laboratory, 1985). Such
retroviruses may be readily obtained from depositories or collections such as the
American Type Culture Collection ("ATCC"; 10801 University Blvd., Manassas, VA
20110-2209), or isolated from known sources using commonly available techniques.

A number of adenovirus vectors have also been described, such as adenovirus
Type 2 and Type 5 vectors. Unlike retroviruses, which integrate into the host genome,
adenoviruses persist extrachromosomally, thus minimizing the risks associated with
insertional mutagenesis (Haj-Ahmad and Graham, J. Virol. (1986) 52:267-274; Bett et
al., J. Virol. (1993) 67:5911-5921; Mittereder et al., Human Gene Therapy (1994)
5:717-729; Seth et al., J. Virol. (1994) 68:933-940; Barr et al., Gene Therapy (1994)
1:51-58; Berkner, K.L., BioTechniques (1988) 6:616-629; and Rich et al., Human Gene
Therapy (1993) 4:461-476).

Molecular conjugate vectors, such as the adenovirus chimeric vectors described
in Michael et al., J. Biol. Chem. (1993) 268:6866-6869 and Wagner et al., Proc. Natl.
Acad. Sci. USA (1992) 89:6099-6103, can also be used for gene delivery.

Members of the Alphavirus genus, such as but not limited to vectors derived
from the Sindbis and Semliki Forest viruses and VEE, will also find use as viral vectors for
delivering the gene of interest. For a description of Sindbis-virus derived vectors useful
for the practice of the instant methods, see Dubensky et al., J. Virol. (1996) 70:508-
519; and International Publication Nos. WO 95/07995 and WO 96/17072.

Other vectors can be used, including but not limited to simian virus 40 and
cytomegalovirus. Bacterial vectors, such as Salmonella spp., Yersinia enterocolitica,
Shigella spp., Vibrio cholerae, Mycobacterium strain BCG, and Listeria
monocytogenes can be used. Minichromosomes such as MC and MC1, bacteriophages,
cosmids (plasmids into which phage lambda cos sites have been inserted) and replicons
(genetic elements that are capable of replication under their own control in a cell) can
also be used.

The expression constructs may also be encapsulated, adsorbed to, or associated
with, particulate carriers. Such carriers present multiple copies of a selected molecule
to the immune system and promote trapping and retention of molecules in local lymph
nodes. The particles can be phagocytosed by macrophages and can enhance antigen
presentation through cytokine release. Examples of particulate carriers include those
derived from polymethyl methacrylate polymers, as well as microparticles derived from
poly(lactides) and poly(lactide-co-glycolides), known as PLG. See, e.g., Jeffery et al.,
Pharm. Res. (1993) 10:362-368; and McGee et al., J. Microencap. (1996).

A wide variety of other methods can be used to deliver the expression
constructs to cells. Such methods include DEAE dextran-mediated transfection,
calcium phosphate precipitation, polylysine- or polyornithine-mediated transfection, or
precipitation using other insoluble inorganic salts, such as strontium phosphate,
aluminum silicates including bentonite and kaolin, chromic oxide, magnesium silicate,
talc, and the like. Other useful methods of transfection include electroporation,
sonoporation, protoplast fusion, liposomes, peptoid delivery, or microinjection. See,
e.g., Sambrook et al., supra, for a discussion of techniques for transforming cells of
interest; and Felgner, P.L., Advanced Drug Delivery Reviews (1990) 5:163-187, for a
review of delivery systems useful for gene transfer. One particularly effective method
of delivering DNA using electroporation is described in International Publication No.
WO 00/45823.

Additionally, biolistic delivery systems employing particulate carriers, such as
gold and tungsten, are especially useful for delivering the expression constructs of the
present invention. The particles are coated with the construct to be delivered and
accelerated to high velocity, generally under a reduced atmosphere, using a gun powder
discharge from a "gene gun." For a description of such techniques, and apparatuses
useful therefor, see, e.g., U.S. Patent Nos. 4,945,050; 5,036,006; 5,100,792;
5,179,022; 5,371,015; and 5,478,744.



Compositions 

The invention also provides compositions comprising the HCV polypeptides or
polynucleotides described herein. Such compositions are useful as diagnostics, for
example, using the mutant polypeptides (or polynucleotides encoding these
polypeptides) in diagnostic reagents. Diagnostics using polypeptides and
polynucleotides are known to those of skill in the art.

In addition, immunogenic compounds can be prepared from one or more
immunogenic polypeptides derived from the polypeptides described herein, for
example the ΔNS35 polypeptide. The preparation of immunogenic compounds which
contain immunogenic polypeptide(s) as active ingredients is known to one skilled in the
art. Typically, such immunogenic compounds are prepared as injectables, either as
liquid solutions or suspensions; solid forms suitable for solution in, or suspension in,
liquid prior to injection can also be prepared. The preparation can also be emulsified, or
the protein encapsulated in liposomes.

Immunogenic and diagnostic compositions of the invention preferably comprise
a pharmaceutically acceptable carrier. The carrier should not itself induce the
production of antibodies harmful to the host. Pharmaceutically acceptable carriers are
well known to those in the art. Such carriers include, but are not limited to, large,
slowly metabolized macromolecules, such as proteins, polysaccharides such as latex
functionalized sepharose, agarose, cellulose, cellulose beads and the like, polylactic
acids, polyglycolic acids, polymeric amino acids such as polyglutamic acid, polylysine,
and the like, amino acid copolymers, and inactive virus particles.

Pharmaceutically acceptable salts can also be used in compositions of the
invention, for example, mineral salts such as hydrochlorides, hydrobromides,
phosphates, or sulfates, as well as salts of organic acids such as acetates, propionates,
malonates, or benzoates. Especially useful protein substrates are serum albumins,
keyhole limpet hemocyanin, immunoglobulin molecules, thyroglobulin, ovalbumin,
tetanus toxoid, and other proteins well known to those of skill in the art. Compositions
of the invention can also contain liquids or excipients, such as water, saline, glycerol,
dextrose, ethanol, or the like, singly or in combination, as well as substances such as
wetting agents, emulsifying agents, or pH buffering agents. Liposomes can also be
used as a carrier for a composition of the invention; such liposomes are described
above.

If desired, co-stimulatory molecules which improve immunogen presentation to
lymphocytes, such as B7-1 or B7-2, or cytokines such as GM-CSF, IL-2, and IL-12,
can be included in a composition of the invention. Optionally, adjuvants can also be
included in a composition. Adjuvants which can be used include, but are not limited to:
(1) aluminum salts (alum), such as aluminum hydroxide, aluminum phosphate,
aluminum sulfate, etc.; (2) oil-in-water emulsion formulations (with or without other
specific immunostimulating agents such as muramyl peptides (see below) or bacterial
cell wall components), such as for example (a) MF59 (PCT Publ. No. WO 90/14837),
containing 5% Squalene, 0.5% Tween 80, and 0.5% Span 85 (optionally containing
various amounts of MTP-PE), formulated into submicron particles using a
microfluidizer such as Model 110Y microfluidizer (Microfluidics, Newton, MA),
(b) SAF, containing 10% Squalane, 0.4% Tween 80, 5% pluronic-blocked polymer
L121, and thr-MDP (see below) either microfluidized into a submicron emulsion or
vortexed to generate a larger particle size emulsion, and (c) Ribi™ adjuvant system
(RAS), (Ribi Immunochem, Hamilton, MT) containing 2% Squalene, 0.2% Tween 80,
and one or more bacterial cell wall components from the group consisting of
monophosphorylipid A (MPL), trehalose dimycolate (TDM), and cell wall skeleton
(CWS), preferably MPL + CWS (Detox™); (3) saponin adjuvants, such as Stimulon™
(Cambridge Bioscience, Worcester, MA) may be used or particles generated therefrom
such as ISCOMs (immunostimulating complexes); (4) Complete Freund's Adjuvant
(CFA) and Incomplete Freund's Adjuvant (IFA); (5) cytokines, such as interleukins
(e.g., IL-1, IL-2, IL-4, IL-5, IL-6, IL-7, IL-12, etc.), interferons (e.g., gamma
interferon), macrophage colony stimulating factor (M-CSF), tumor necrosis factor
(TNF), etc.; (6) detoxified mutants of a bacterial ADP-ribosylating toxin such as a
cholera toxin (CT), a pertussis toxin (PT), or an E. coli heat-labile toxin (LT),
particularly LT-K63, LT-R72, CT-S109, PT-K9/G129; see, e.g., WO 93/13302 and
WO 92/19265; (7) other substances that act as immunostimulating agents to enhance
the effectiveness of the composition; and (8) microparticles with adsorbed
macromolecules, as described in copending U.S. Patent Application Serial No.
09/285,855 (filed April 2, 1999) and International Patent Application Serial No.
PCT/US99/17308 (filed July 29, 1999). Alum and MF59 are preferred. The
effectiveness of an adjuvant can be determined by measuring the amount of antibodies
directed against an immunogenic polypeptide containing an HCV antigenic sequence
resulting from administration of this polypeptide in immunogenic compounds which
also comprise the various adjuvants.

As mentioned above, muramyl peptides include, but are not limited to, N-
acetyl-muramyl-L-threonyl-D-isoglutamine (thr-MDP), N-acetyl-normuramyl-L-alanyl-
D-isoglutamine (CGP 11637, referred to as nor-MDP), N-acetylmuramyl-L-alanyl-D-
isoglutaminyl-L-alanine-2-(1'-2'-dipalmitoyl-sn-glycero-3-hydroxyphosphoryloxy)-
ethylamine (CGP 19835A, referred to as MTP-PE), etc.

Thus, such recombinant or synthetic HCV polypeptides can be used in vaccines
and as diagnostics. Further, antibodies raised against these polypeptides can also be
used as diagnostics, or for passive immunotherapy. In addition, antibodies to these
polypeptides are useful for isolating and identifying HCV particles.

Native HCV antigens can also be isolated from HCV virions. The virions can be
grown in HCV infected cells in tissue culture, or in an infected host.

Administration and Delivery 

The polynucleotide and polypeptide compositions described herein (e.g.,
immunogenic compounds) may be administered to a subject using any suitable delivery
means. Methods of delivering nucleic acids into host cells are discussed above.
Further, HCV polynucleotides and/or polypeptides can be administered parenterally, by
injection, usually subcutaneously, intramuscularly, transdermally or transcutaneously.
Certain adjuvants, e.g., LTK63, LTR72 or PLG formulations, can be administered
intranasally or orally. Additional formulations which are suitable for other modes of
administration include suppositories. For suppositories, traditional binders and carriers
can include, for example, polyalkylene glycols or triglycerides; such suppositories can
be formed from mixtures containing the active ingredient in the range of 0.5% to 10%,
preferably 1%-2%. Other oral formulations include such normally employed excipients
as, for example, pharmaceutical grades of mannitol, lactose, starch, magnesium
stearate, sodium saccharine, cellulose, magnesium carbonate, and the like. These
compositions take the form of solutions, suspensions, tablets, pills, capsules, sustained
release formulations or powders and contain 10%-95% of active ingredient, preferably
25%-70%.

The polypeptides of the present invention can be formulated into the
immunogenic compound as neutral or salt forms. Pharmaceutically acceptable salts
include the acid addition salts (formed with free amino groups of the peptide), which
are formed with inorganic acids such as, for example, hydrochloric or
phosphoric acids, or such organic acids as acetic, oxalic, tartaric, maleic, and the
like. Salts formed with the free carboxyl groups can also be derived from inorganic
bases such as, for example, sodium, potassium, ammonium, calcium, or ferric
hydroxides, and such organic bases as isopropylamine, trimethylamine, 2-ethylamino
ethanol, histidine, procaine, and the like.

The immunogenic compounds are administered in a manner compatible with the
dosage formulation, and in such amount as will be prophylactically and/or
therapeutically effective. The quantity to be administered, which is generally in the
range of 5 micrograms to 250 micrograms of polypeptide per dose, depends on the
subject to be treated, capacity of the subject's immune system to synthesize antibodies,
and the degree of protection desired. Precise amounts of active ingredient required to be
administered may depend on the judgment of the practitioner and can be peculiar to
each subject.

The immunogenic compound can be given in a single dose schedule, or
preferably in a multiple dose schedule. A multiple dose schedule is one in which a
primary course of vaccination can be with 1-10 separate doses, followed by other doses
given at subsequent time intervals required to maintain and/or reinforce the immune
response, for example, at 1-4 months for a second dose, and if needed, a subsequent
dose(s) after several months. Further, the course of administration may include
polynucleotides and polypeptides, together or sequentially (for example, priming with a
polynucleotide composition and boosting with a polypeptide composition). The dosage
regimen will also, at least in part, be determined by the need of the individual and be
dependent upon the judgment of the practitioner.

In certain embodiments, administration of the polynucleotides and polypeptides
described herein is used to activate T cells. In addition to the practical advantages of
simplicity of construction and modification, administration of polynucleotides encoding
mutant NS polypeptides results in the synthesis of a mutant NS polypeptide in the host.
Thus, these immunogens are presented to the host immune system with native post-
translational modifications, structure, and conformation. The polynucleotides are
preferably injected intramuscularly to a large mammal, such as a human, at a dose of
0.5, 0.75, 1.0, 1.5, 2.0, 2.5, 5 or 10 mg/kg.
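
Purely as an arithmetic illustration of the weight-based dose levels listed above, the sketch below converts each mg/kg level into a total amount of polynucleotide for a given body weight. The function names and the 70 kg example weight are assumptions introduced here; the dose levels are the ones recited in the text.

```python
# Arithmetic illustration only; names and the example body weight are assumptions.
DOSE_LEVELS_MG_PER_KG = (0.5, 0.75, 1.0, 1.5, 2.0, 2.5, 5.0, 10.0)  # levels from the text


def total_dose_mg(body_weight_kg, dose_mg_per_kg):
    """Total polynucleotide mass (mg) for a weight-based dose."""
    if body_weight_kg <= 0 or dose_mg_per_kg <= 0:
        raise ValueError("body weight and dose level must be positive")
    return body_weight_kg * dose_mg_per_kg


def dose_table(body_weight_kg):
    """Map each listed mg/kg level to the total dose in mg for one subject."""
    return {level: total_dose_mg(body_weight_kg, level)
            for level in DOSE_LEVELS_MG_PER_KG}


if __name__ == "__main__":
    for level, total in dose_table(70.0).items():  # 70 kg is a hypothetical weight
        print(f"{level} mg/kg -> {total} mg")
```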

The proteins and/or polynucleotides can be administered either to a mammal
which is not infected with an HCV or to an HCV-infected
mammal. The particular dosages of the polynucleotides or fusion proteins in a
composition will depend on many factors including, but not limited to, the species,
age, and general condition of the mammal to which the composition is administered,
and the mode of administration of the composition. An effective amount of the
composition of the invention can be readily determined using only routine
experimentation. In vitro and in vivo models can be employed to identify appropriate
doses. Generally, 0.5, 0.75, 1.0, 1.5, 2.0, 2.5, 5 or 10 mg will be administered to a large
mammal, such as a baboon, chimpanzee, or human. If desired, co-stimulatory
molecules or adjuvants can also be provided before, after, or together with the
compositions.

Antibodies and Diagnostics

Antibodies, both monoclonal and polyclonal, which are directed against HCV
epitopes are particularly useful in diagnosis, and those which are neutralizing are useful
in passive immunotherapy. Monoclonal antibodies, in particular, may be used to raise
anti-idiotype antibodies.

Anti-idiotype antibodies are immunoglobulins which carry an "internal image"
of the antigen of the infectious agent against which protection is desired. Techniques
for raising anti-idiotype antibodies are known in the art. See, e.g., Grzych (1985),
Nature 316:74; MacNamara et al. (1984), Science 226:1325; Uytdehaag et al. (1985), J.
Immunol. 134:1225. These anti-idiotype antibodies may also be useful for treatment
and/or diagnosis of NANBH, as well as for an elucidation of the immunogenic regions
of HCV antigens.

An immunoassay for viral antigen may use, for example, a monoclonal antibody
directed towards a viral epitope, a combination of monoclonal antibodies directed
towards epitopes of one viral polypeptide, monoclonal antibodies directed towards
epitopes of different viral polypeptides, polyclonal antibodies directed towards the
same viral antigen, polyclonal antibodies directed towards different viral antigens, or a
combination of monoclonal and polyclonal antibodies.

Immunoassay protocols may be based, for example, upon competition, or direct
reaction, or sandwich type assays. Protocols may also, for example, use solid supports,
or may be by immunoprecipitation. Most assays involve the use of labeled antibody or
polypeptide. The labels may be, for example, fluorescent, chemiluminescent,
radioactive, or dye molecules. Assays which amplify the signals from the probe are also
known, examples of which are assays which utilize biotin and avidin, and enzyme-
labeled and mediated immunoassays, such as ELISA assays.

An enzyme-linked immunosorbent assay (ELISA) can be used to measure either
antigen or antibody concentrations. This method depends upon conjugation of an
enzyme to either an antigen or an antibody, and uses the bound enzyme activity as a
quantitative label. To measure antibody, the known antigen is fixed to a solid phase
(e.g., a microplate or plastic cup), incubated with test serum dilutions, washed,
incubated with anti-immunoglobulin labeled with an enzyme, and washed again.
Enzymes suitable for labeling are known in the art, and include, for example,
horseradish peroxidase. Enzyme activity bound to the solid phase is measured by
adding the specific substrate, and determining product formation or substrate utilization
colorimetrically. The enzyme activity bound is a direct function of the amount of
antibody bound.

To measure antigen, a known specific antibody is fixed to the solid phase, the
test material containing antigen is added, the solid phase is washed after an incubation,
and a second enzyme-labeled antibody is added. After washing, substrate is added, and
enzyme activity is estimated colorimetrically and related to antigen concentration.
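
The final step above, relating a colorimetric reading to antigen concentration, is commonly done against a standard curve built from calibrators of known concentration. The sketch below uses piecewise-linear interpolation between calibrators as a deliberately simple stand-in; the specification does not prescribe a curve-fitting method, and the function name, the calibrator values and the units are illustrative assumptions.

```python
# Minimal sketch: piecewise-linear standard curve. The method, names and
# calibrator values are assumptions, not taken from the specification.

def interpolate_concentration(standards, absorbance):
    """Estimate concentration from absorbance using calibrator points.

    standards: iterable of (concentration, absorbance) pairs whose absorbance
    increases with concentration. Readings outside the calibrated range are
    rejected rather than extrapolated.
    """
    points = sorted(standards, key=lambda pair: pair[1])
    if len(points) < 2:
        raise ValueError("at least two calibrators are required")
    if not points[0][1] <= absorbance <= points[-1][1]:
        raise ValueError("absorbance outside the range of the standards")
    for (c0, a0), (c1, a1) in zip(points, points[1:]):
        if a0 <= absorbance <= a1:
            if a1 == a0:  # flat segment: report the lower calibrator
                return c0
            return c0 + (c1 - c0) * (absorbance - a0) / (a1 - a0)


if __name__ == "__main__":
    # Hypothetical calibrators: (concentration in ng/mL, absorbance)
    calibrators = [(0.0, 0.05), (10.0, 0.25), (20.0, 0.45), (40.0, 0.85)]
    print(interpolate_concentration(calibrators, 0.35))
```

Rejecting out-of-range readings reflects the usual practice of diluting and re-assaying a sample rather than extrapolating beyond the calibrators.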

The HCV fusion proteins, such as NS3 mutant and core fusion proteins, can
also be used to produce HCV-specific polyclonal and monoclonal antibodies. HCV-
specific polyclonal and monoclonal antibodies specifically bind to HCV antigens.

Polyclonal antibodies can be produced by administering the fusion protein to a
mammal, such as a mouse, a rabbit, a goat, or a horse. Serum from the immunized
animal is collected and the antibodies are purified from the plasma by, for example,
precipitation with ammonium sulfate, followed by chromatography, preferably affinity
chromatography. Techniques for producing and processing polyclonal antisera are
known in the art.

Monoclonal antibodies directed against HCV-specific epitopes present in the
fusion proteins can also be readily produced. Normal B cells from a mammal, such as a
mouse, immunized with, e.g., a mutant NS3 polypeptide or NS-core fusion protein can
be fused with, for example, HAT-sensitive mouse myeloma cells to produce
hybridomas. Hybridomas producing HCV-specific antibodies can be identified using
RIA or ELISA and isolated by cloning in semi-solid agar or by limiting dilution.
Clones producing HCV-specific antibodies are isolated by another round of screening.

Antibodies, either monoclonal or polyclonal, which are directed against HCV
epitopes, are particularly useful for detecting the presence of HCV or HCV antigens in
a sample, such as a serum sample from an HCV-infected human. An immunoassay for
an HCV antigen may utilize one antibody or several antibodies. An immunoassay for
an HCV antigen may use, for example, a monoclonal antibody directed towards an
HCV epitope, a combination of monoclonal antibodies directed towards epitopes of one
HCV polypeptide, monoclonal antibodies directed towards epitopes of different HCV
polypeptides, polyclonal antibodies directed towards the same HCV antigen, polyclonal
antibodies directed towards different HCV antigens, or a combination of monoclonal
and polyclonal antibodies. Immunoassay protocols may be based, for example, upon
competition, direct reaction, or sandwich type assays using, for example, labeled
antibody. The labels may be, for example, fluorescent, chemiluminescent, or
radioactive.

The polyclonal or monoclonal antibodies may further be used to isolate HCV
particles or antigens by immunoaffinity columns. The antibodies can be affixed to a
solid support by, for example, adsorption or by covalent linkage so that the antibodies
retain their immunoselective activity. Optionally, spacer groups may be included so
that the antigen binding site of the antibody remains accessible. The immobilized
antibodies can then be used to bind HCV particles or antigens from a biological sample,
such as blood or plasma. The bound HCV particles or antigens are recovered from the
column matrix by, for example, a change in pH.

Methods of Eliciting Immune Responses

HCV-specific T cells that are activated by the above-described polypeptides,
expressed in vivo or in vitro, preferably recognize an epitope of an HCV polypeptide
such as a mutant NS3 polypeptide, including an epitope of a mutant HCV polypeptide.
HCV-specific T cells can be CD8+ or CD4+.

HCV-specific CD8+ T cells preferably are cytotoxic T lymphocytes (CTL)
which can kill HCV-infected cells that display NS3, NS4, NS5a, NS5b epitopes
complexed with an MHC class I molecule. HCV-specific CD8+ T cells may also
express interferon-γ (IFN-γ). HCV-specific CD8+ T cells can be detected by, for
example, 51Cr release assays. 51Cr release assays measure the ability of HCV-specific
CD8+ T cells to lyse target cells displaying a nonstructural (e.g., mutant NS) epitope.
HCV-specific CD8+ T cells which express IFN-γ can also be detected by
immunological methods, preferably by intracellular staining for IFN-γ after in vitro
stimulation with a mutant NS polypeptide.

HCV-specific CD4+ cells activated by the above-described polypeptides,
expressed in vivo or in vitro, and combinations of the individual components of these
proteins, preferably recognize an epitope of a mutant non-structural polypeptide,
including an epitope of a mutant protein, that is bound to an MHC class II molecule on
an HCV-infected cell and proliferate in response to stimulating mutant peptides.

HCV-specific CD4+ T cells can be detected by a lymphoproliferation assay.
Lymphoproliferation assays measure the ability of HCV-specific CD4+ T cells to
proliferate in response to an epitope.
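A lymphoproliferation assay is often summarized as a stimulation index, i.e. the mean readout of epitope-stimulated wells divided by the mean readout of unstimulated control wells. The following Python sketch illustrates that calculation only; the readout (for example, counts per minute of an incorporated label), the function name and the replicate values are assumptions for illustration and are not specified herein.

```python
from statistics import mean

def stimulation_index(stimulated, unstimulated):
    """Ratio of mean proliferation readout with epitope to mean readout without.

    Both arguments are lists of replicate-well readouts (hypothetical units,
    e.g. counts per minute). A value well above 1 indicates proliferation
    in response to the epitope.
    """
    background = mean(unstimulated)
    if background <= 0:
        raise ValueError("unstimulated readout must be positive")
    return mean(stimulated) / background

# Hypothetical triplicate wells, for illustration only.
si = stimulation_index([3000, 3300, 2700], [1000, 1100, 900])
print(f"stimulation index = {si:.1f}")
```

The threshold above which an index is scored as a positive response is a matter of assay design and is not fixed by this sketch.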

Mutant NS (or fusions thereof with core, envelope or other viral polypeptides)
can be used to activate HCV-specific T cells either in vitro or in vivo. Activation of
HCV-specific T cells can be used, inter alia, to provide model systems to optimize
CTL responses to HCV and to provide prophylactic or therapeutic treatment against
HCV infection. For in vitro activation, proteins are preferably supplied to T cells via a
plasmid or a viral vector, such as an adenovirus vector, as described above.

Polyclonal populations of T cells can be derived from the blood, and preferably
from peripheral lymphoid organs, such as lymph nodes, spleen, or thymus, of mammals
that have been infected with an HCV. Preferred mammals include mice, chimpanzees,
baboons, and humans. The HCV serves to expand the number of activated HCV-specific
T cells in the mammal. The HCV-specific T cells derived from the mammal
can then be restimulated in vitro by adding HCV epitopic peptides to the T cells. The
HCV-specific T cells can then be tested for, inter alia, proliferation (e.g.,
lymphoproliferation assays known in the art), the production of IFN-γ, and the ability
to lyse target cells displaying HCV NS epitopes in vitro.

The following examples are meant to illustrate the invention and are not meant
to limit it in any way. Those of ordinary skill in the art will recognize modifications
within the spirit and scope of the invention as set forth herein.

EXAMPLES
Example 1: Constructs

pCMV-II: pCMV-II (Figure 7, SEQ ID NO:5) was created to contain the human
CMV promoter, enhancer, intron A, polylinker and the bovine growth hormone
terminator in a deleted-pUC backbone (Life Technologies).

pT7-HCV: pT7-HCV was created in a polylinker-modified pUC vector to
contain full-length HCV cDNA preceded by a synthetic T7 promoter. pT7-HCV also
contains the complete 5' UTR and the poly A version of the 3' UTR.

pCMV.ΔNS35: To generate pCMV.ΔNS35 (Figure 5, SEQ ID NO:3), a two
step procedure was undertaken. First, a PCR product was generated from pT7-HCV
that corresponded to the following: a 5' EcoRI site, followed by the Kozak sequence of
ACCATGG; the initiator ATG followed by amino acid #1242 and continuing to the
StuI site. Second, the StuI to XbaI fragment from a full-length genomic clone was
isolated. The genomic clone consisted of the T7 promoter fused to the full-length HCV
cDNA with the poly A version of the 3' end, in a pUC vector. Finally, the EcoRI-StuI
and StuI-XbaI fragments were ligated into the pCMV-II expression vector, transformed
into HB101 competent cells and plated onto ampicillin (100 μg/ml). Miniprep analyses
led to the identification of the desired clone, which was amplified on a larger scale using
a Qiagen Gigaprep kit following the manufacturer's specifications. The resulting clone
was named pCMV.ΔNS35 (Figure 5, SEQ ID NO:3).

pd.ΔNS3NS5: As shown schematically in Figure 10, the yeast expression
plasmid pd.ΔNS3NS5 (SEQ ID NO:8) was constructed using restriction fragments
obtained from the mammalian expression plasmid pCMV.KM.ΔNS35.
pCMV.KM.ΔNS35 is identical to pCMV.ΔNS35 (Figure 5, SEQ ID NO:3) except that
it contains a kanamycin resistance gene in the viral backbone. pCMV.KM.ΔNS35 was
digested with EcoRI and NheI to obtain a 2895bp EcoRI-NheI fragment. The EcoRI-NheI
fragment was ligated into a pRSET HindIII-NheI subcloning vector with oligos (HE)
from HindIII to EcoRI. After sequence verification, pRSET HindIII-NheI #6 was
digested with HindIII and NheI to obtain a 2908bp HindIII-NheI fragment.

pCMV.KM.ΔNS35 was linearized with XbaI and ligated with synthetic oligos
(XS) from XbaI to SalI. The ligation was digested with NheI and SalI to obtain a 2481bp
NheI-SalI fragment. The fragment was ligated into a pET3a NheI-SalI subcloning
vector. After sequence verification, pET3a NheI-SalI #2 was digested with NheI and
SalI to obtain a 2481bp NheI-SalI fragment. The BamHI-HindIII ADH2/GAPDH promoter
fragment was then ligated with the HindIII-NheI and NheI-SalI fragments into the pBS24.1
BamHI-SalI yeast expression vector.

pd.ΔNS3NS5.PJ: pd.ΔNS3NS5.PJ (Figures 13 and 14; SEQ ID NO:10) was
generated to create a "perfect junction" at the 5' and 3' ends of the HCV coding region.
At the 5' end of pd.ΔNS3NS5, there were 6 extra bases between the yeast
ADH2/GAPDH promoter and the ATG of the polypeptide. At the 3' end, there were 52
bases of untranslated sequence between the stop codon of the polypeptide and the
α-factor terminator in the yeast expression vector. pd.ΔNS3NS5.PJ was created by
digesting pd.ΔNS3NS5 #17 with ScaI and SphI to obtain a 4963bp ScaI-SphI fragment.
pd.NS5b3011 was digested with SphI and SalI to obtain a 321bp SphI-SalI fragment
which gave the "perfect junction" at the 3' end of the polypeptide. The ScaI-SphI and
SphI-SalI fragments were ligated into a pSP72 HindIII-SalI subcloning vector with
synthetic oligos (HS) from HindIII to ScaI for the "perfect junction" at the 5' end.

The region of synthetic sequence in pSP72 HindIII-SalI clone #6 was verified.
pSP72 HindIII-SalI clone #6 was digested with HindIII and BlnI or with BlnI and SalI
to obtain 2441bp HindIII-BlnI and 2895bp BlnI-SalI fragments, respectively. The
BamHI-HindIII ADH2/GAPDH promoter fragment was ligated with the HindIII-BlnI and
BlnI-SalI fragments into the pBS24.1 BamHI-SalI yeast expression vector.

pd.ΔNS3NS5.PJ.core121RT and pd.ΔNS3NS5.PJ.core173RT were generated
and encode HCV core aa 1-121 at the C-terminus of the ΔNS3NS5 polypeptide
(designated pd.ΔNS3NS5.PJ.core121RT, SEQ ID NO:12) and core aa 1-173 at the
C-terminus of the ΔNS3NS5 polypeptide (designated pd.ΔNS3NS5.PJ.core173RT, SEQ
ID NO:14). The core sequence had aa 9 mutated from Lys to Arg and aa 11 mutated
from Asn to Thr, designated as core 121RT or 173RT.

pd.ΔNS3NS5.PJ.core121RT and pd.ΔNS3NS5.PJ.core173RT: To generate
pd.ΔNS3NS5.PJ.core121RT (Figure 17, SEQ ID NO:12) and
pd.ΔNS3NS5.PJ.core173RT (Figure 18, SEQ ID NO:14), NotI-SalI HCVcore121RT
and HCVcore173RT fragments were amplified by PCR from an E. coli
expression plasmid, pSODCF2.HCVcore191RT #2, as shown in Figure 16. Either the
core121RT NotI-SalI PCR product or the core173RT NotI-SalI PCR product was ligated
into a pT7Blue2 PstI-SalI subcloning vector with synthetic oligos (PN) from PstI to NotI.
After sequence confirmation, pT7Blue2core121RT clone #9 and pT7Blue2core173RT
clone #11 were digested with PstI and SalI to obtain 403bp and 559bp PstI-SalI
fragments, respectively, for further cloning.

A 121bp NotI-PstI fragment from pSP72 HindIII-SalI clone #6 was isolated as
described above during the cloning of pd.ΔNS3NS5.PJ. The NotI-PstI and PstI-SalI
fragments were assembled into a vector made by digesting pd.ΔNS3NS5.PJ clone #5
(described above) with NotI and SalI.

ΔNS3NS5 and Core 140 and Core 150: An HCV core epitope was found which
elicits CTLs in baboons (HCV core aa 121-135). Since pd.ΔNS3NS5.PJ.core121RT
ends right before this potentially important epitope and was expressed better than the
longer pd.ΔNS3NS5.PJ.core173RT construct (Example 2), two intermediate constructs
were made which include this epitope, possibly giving intermediate expression levels.
The two new constructs fused HCV core aa 1-140 or HCV core aa 1-150 to the C
terminus of ΔNS3NS5.PJ.

pd.ΔNS3NS5.PJ.core140RT (Figure 21, SEQ ID NO:16) and
pd.ΔNS3NS5.PJ.core150RT (Figure 22, SEQ ID NO:18): As shown in Figure 20, a
PstI-SalI HCVcore140RT and a PstI-SalI HCVcore150RT fragment were amplified by
PCR from pd.ΔNS3NS5.PJ.core173RT clone #16. Either HCV core PstI-SalI
PCR product was ligated into a pT7Blue2 PstI-SalI subcloning vector. After sequence
confirmation, pT7Blue2core140RT clone #22 and pT7Blue2core150RT clone #26 were
digested with PstI and SalI to obtain 460bp and 490bp PstI-SalI fragments, respectively, for
further cloning.

A 121bp NotI-PstI fragment was isolated from pSP72 HindIII-SalI clone #6 (as
described above during the cloning of pd.ΔNS3NS5.PJ). The NotI-PstI and PstI-SalI
fragments were assembled into a vector made by digesting pd.ΔNS3NS5.PJ clone #5
(described above) with NotI and SalI.
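The PstI-SalI fragment sizes reported in this Example for the four core fusions (403bp, 460bp, 490bp and 559bp) can be cross-checked against the core truncation points, since each additional core residue should add one 3-bp codon. The following Python sketch is an illustrative consistency check only; the function and variable names are hypothetical and the check assumes the fragments differ solely in the length of the core coding sequence.

```python
# PstI-SalI fragment sizes (bp) reported in Example 1, keyed by the last
# HCV core residue encoded by each construct.
REPORTED_BP = {121: 403, 140: 460, 150: 490, 173: 559}

def expected_bp(core_end, ref_end=121, ref_bp=403):
    """Fragment length if each extra core residue adds exactly one codon (3 bp).

    Uses the core121RT fragment as the reference point (an assumption made
    for this check, not a statement about how the fragments were designed).
    """
    return ref_bp + 3 * (core_end - ref_end)

def check_fragments(reported=REPORTED_BP):
    """Map each truncation point to True if the reported size matches."""
    return {end: expected_bp(end) == bp for end, bp in reported.items()}

for end, ok in sorted(check_fragments().items()):
    print(f"core aa 1-{end}: reported {REPORTED_BP[end]} bp, "
          f"expected {expected_bp(end)} bp, consistent={ok}")
```

Under that assumption all four reported sizes agree with a 3-bp increment per residue.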

Example 2: Protein Expression

Various of the constructs described herein, encoding HCV-1 ΔNS3 to NS5
antigen (aa 1242-3011), were expressed in yeast. S. cerevisiae strain AD3 was
transformed with pd.ΔNS3NS5 and checked for expression. A stained protein band at
the expected molecular weight of 194 kD was not observed (Figure 12). Strain AD3
was also transformed with pd.ΔNS3NS5.PJ clone #5 and checked for expression. A
protein band of the expected molecular weight of 194kD was detected (Figure 15).
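The expected molecular weight of about 194 kD is in line with a rough estimate from the number of residues encoded (an initiator methionine followed by aa 1242-3011). The following Python sketch assumes an average residue mass of about 110 Da, a common rule of thumb that ignores the actual amino acid composition and any modifications; the function name is hypothetical and the sketch does not represent how the expected value stated above was derived.

```python
AVG_RESIDUE_DA = 110.0  # rule-of-thumb average mass per amino acid residue (assumption)

def estimate_kd(first_aa, last_aa, extra_residues=1):
    """Approximate polypeptide mass in kD from polyprotein coordinates.

    first_aa/last_aa: inclusive HCV polyprotein positions of the encoded region.
    extra_residues: residues added outside that range (here the initiator Met).
    """
    n_residues = (last_aa - first_aa + 1) + extra_residues
    return n_residues * AVG_RESIDUE_DA / 1000.0

mw = estimate_kd(1242, 3011)
print(f"approximately {mw:.0f} kD for {3011 - 1242 + 2} residues")
```

Because the per-residue mass is only an average, estimates of this kind can differ from sequence-based values by a few kD.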

Strain AD3 was transformed with pd.ΔNS3NS5.PJ.core121RT clone #6 and
pd.ΔNS3NS5.PJ.core173RT clone #15 and checked for expression. Protein bands of the
expected molecular weights of 206kD and 210kD, respectively, were observed.
Expression levels of the pd.ΔNS3NS5.PJ.core173RT construct were much lower than
those of the pd.ΔNS3NS5.PJ.core121RT construct (see Figure 19). Thus, protein
expression levels correlate inversely with the length of HCV core.

Strain AD3 was transformed with pd.ΔNS3NS5.PJ.core140RT clone #29 and
pd.ΔNS3NS5.PJ.core150RT clone #35 and checked for expression. Bands of the
expected molecular weights of 208kD and 209kD were seen by stain at levels close to
those of pd.ΔNS3NS5.PJ.core173RT (Figure 23).

Example 3: Eliciting Immune Responses

A. Immunization

To evaluate the immunogenicity of the mutant NS polypeptides, studies using
guinea pigs, rabbits, mice, rhesus macaques and/or baboons are performed. The studies
are structured as follows: DNA immunization alone (single or multiple); DNA
immunization followed by protein immunization (boost); immunization by PLG
particles. Immunization is intramuscular or mucosal.

B. Humoral Immune Response

The humoral immune response is checked in serum specimens from immunized
animals with anti-NS antibody ELISAs (enzyme-linked immunosorbent assays) at
various times post-immunization. Briefly, serum from immunized animals is screened
for antibodies directed against the NS or mutant NS proteins. Wells of ELISA
microtiter plates are coated overnight with the selected HCV protein and washed four
times; subsequently, blocking is done with PBS-0.2% Tween (Sigma). After removal
of the blocking solution, diluted mouse serum is added. Sera are tested at various
dilutions. Microtiter plates are washed and incubated with a secondary, peroxidase-coupled
anti-mouse IgG antibody (Pierce, Rockford, IL). ELISA plates are washed and
3,3',5,5'-tetramethyl benzidine (TMB; Pierce) is added per well. The optical density
of each well is measured. Titers are typically reported as the reciprocal of the dilution
of serum that gave a half-maximum optical density (O.D.). Similarly, generation of
neutralization of binding (NOB) antibodies can be measured by methods known in the
art.
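The half-maximum titer described above can be read off a dilution series by interpolation. The following Python sketch is one simple way to do so, assuming linear interpolation on a log10 dilution scale between the two measurements that bracket the half-maximum O.D.; the function name and the example readings are hypothetical, and curve fitting (for example a four-parameter fit) may be preferred in practice.

```python
import math

def half_max_titer(dilutions, ods):
    """Reciprocal dilution at which the O.D. falls to half of its maximum.

    dilutions: reciprocal serum dilutions in increasing order (e.g. 100, 1000).
    ods: optical densities measured at those dilutions.
    Returns None if the curve never crosses half-maximum in the tested range.
    """
    half = max(ods) / 2.0
    for i in range(len(ods) - 1):
        y1, y2 = ods[i], ods[i + 1]
        if y1 >= half >= y2 and y1 != y2:
            # Fraction of the way from point i to point i+1 on a log10 scale.
            frac = (y1 - half) / (y1 - y2)
            log_d1 = math.log10(dilutions[i])
            log_d2 = math.log10(dilutions[i + 1])
            return 10 ** (log_d1 + frac * (log_d2 - log_d1))
    return None

# Hypothetical readings, for illustration only.
titer = half_max_titer([100, 1000, 10000], [2.0, 1.0, 0.1])
print(f"half-maximum titer is about 1:{titer:.0f}")
```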

C. Cellular Immune Response

The frequency of specific cytotoxic T-lymphocytes (CTL) is evaluated by a
standard chromium release assay of peptide-pulsed BALB/c mouse CD4 cells. Briefly,
spleen cells (Effector cells, E) are obtained from the BALB/c mice immunized,
cultured, restimulated, and assayed for CTL activity against HCV peptide-pulsed target
cells. Cytotoxic activity is measured in a standard 51Cr release assay.
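Results of a 51Cr release assay are conventionally expressed as percent specific lysis, computed from the release measured with effector cells, the spontaneous release from targets alone, and the maximum release from fully lysed targets. The following Python sketch shows that conventional formula; the function name and the example counts are hypothetical and no particular readout values are taken from this document.

```python
def percent_specific_lysis(experimental, spontaneous, maximum):
    """Percent specific lysis from 51Cr release counts.

    experimental: release from targets incubated with effector cells.
    spontaneous: release from targets incubated in medium alone.
    maximum: release from targets lysed completely (e.g. with detergent).
    """
    if maximum <= spontaneous:
        raise ValueError("maximum release must exceed spontaneous release")
    return 100.0 * (experimental - spontaneous) / (maximum - spontaneous)

# Hypothetical counts per minute, for illustration only.
print(percent_specific_lysis(1500, 500, 2500))
```

Lysis is typically reported at several effector-to-target (E:T) ratios so that activity can be compared between groups.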

Example 4: Immunization with PLG-delivered DNA.

The polylactide-co-glycolide (PLG) polymers are obtained from Boehringer
Ingelheim, U.S.A. The PLG polymer is RG505, which has a copolymer ratio of 50/50
and a molecular weight of 65 kDa (manufacturer's data). Cationic microparticles with
adsorbed DNA are prepared using a modified solvent evaporation process, essentially
as described in Singh et al., Proc. Natl. Acad. Sci. USA (2000) 97:811-816. Briefly, the
microparticles are prepared by emulsifying a 5% w/v polymer solution in methylene
chloride with PBS at high speed using an IKA homogenizer. The primary emulsion is
then added to distilled water containing cetyl trimethyl ammonium bromide (CTAB)
(0.5% w/v). This results in the formation of a w/o/w emulsion which is stirred at
room temperature, allowing the methylene chloride to evaporate. The resulting
microparticles are washed in distilled water by centrifugation and freeze dried.
Following preparation, washing and collection, DNA is adsorbed onto the
microparticles by incubating cationic microparticles in a solution of DNA. The
microparticles are then separated by centrifugation, the pellet washed with TE buffer
and the microparticles are freeze dried, resuspended and administered to animals.
Antibody titers are measured by ELISA assays.

What is claimed is:

1. An isolated mutant non-structural ("NS") HCV polypeptide comprising
a polypeptide having a mutation in the catalytic domain of NS3, wherein said mutation
functionally disrupts the catalytic domain.

2. The polypeptide of claim 1, wherein the mutation comprises a deletion.

3. The polypeptide of claim 1, wherein the mutation comprises a
substitution.

4. The polypeptide of any of claims 1-3, wherein said NS polypeptide
comprises NS3, NS4 and NS5.

5. The polypeptide of any of claims 1-3, wherein said NS polypeptide
consists of NS3, NS4 and NS5.

6. The polypeptide of any of claims 1-3, wherein said NS polypeptide
consists of NS3 and NS5.

7. The polypeptide of claim 6, wherein NS5 consists of NS5a.

8. The polypeptide of claim 6, wherein NS5 consists of NS5b.

9. The polypeptide of any of claims 1-3, wherein said NS polypeptide
consists of NS3 and NS4.

10. The polypeptide of claim 9, wherein NS4 consists of NS4a.

11. The polypeptide of claim 9, wherein NS4 consists of NS4b.

12. The polypeptide of claim 4, further comprising a second viral
polypeptide that is not NS3, NS4, or NS5 of HCV.

13. The polypeptide of claim 12, wherein the second viral polypeptide
comprises an HCV Core polypeptide ("C"), or fragment thereof.

14. The polypeptide of claim 13, wherein the C polypeptide is truncated.

15. The polypeptide of claim 14, wherein the truncation is at amino acid
121.

16. The polypeptide of claim 12, wherein the polypeptide further comprises
an HCV envelope protein ("E").

17. The polypeptide of claim 16, wherein the E is E1.

18. The polypeptide of claim 16, wherein the E is E2.

19. A composition comprising

(a) the polypeptide of any one of claims 1-18; and

(b) a pharmaceutically acceptable excipient.

20. An isolated and purified polynucleotide which encodes the mutant HCV
polypeptide according to any one of claims 1-18.

21. A composition comprising

(a) the isolated purified polynucleotide of claim 20; and

(b) a pharmaceutically acceptable excipient.

22. The composition of claim 21, wherein the polynucleotide is DNA.

23. The composition of claim 21, wherein the polynucleotide is in a
plasmid.

24. An expression vector comprising the polynucleotide of claim 20.

25. An expression vector comprising the polynucleotide of SEQ ID NO:8.

26. A host cell comprising the polynucleotide of claim 20.

27. The host cell of claim 26, wherein the cell is a yeast cell.

28. The host cell of claim 26, wherein the cell is a mammalian cell.

29. The host cell of claim 26, wherein the cell is an insect cell.

30. The host cell of claim 26, wherein the cell is a plant cell.

31. The host cell of claim 26, wherein the polynucleotide comprises the
sequence of SEQ ID NO:8.

32. The polypeptide of claim 1, wherein the polypeptide further comprises
SEQ ID NO:9.

33. A method of preparing a mutant NS HCV polypeptide, wherein the
method comprises the steps of:

a. transforming a host cell with an expression vector according to
claim 24, under conditions wherein the polypeptide is expressed;
and

b. isolating the polypeptide.

34. The method of claim 33, wherein the host cell is a yeast cell.

35. The method of claim 33, wherein the host cell is a mammalian cell.

36. The method of claim 33, wherein the host cell is an insect cell.

37. The method of claim 33, wherein the host cell is a plant cell.

38. An antibody that specifically binds to a polypeptide of any of claims
1-18.

39. The antibody of claim 38, wherein the antibody is a monoclonal
antibody.

40. The antibody of claim 38, wherein the antibody is a purified polyclonal
antibody.

41. A method of eliciting an immune response in a subject, comprising the
step of administering to the subject a polypeptide of any of claims 1-18.

42. A method of eliciting an immune response in a subject, comprising the
step of administering to the subject a polynucleotide of claim 20.

1 TCGCGCGTTT CGGTGATGAC GGTGAAAACC TCTGACACAT GCAGCTCCCG GAGACGGTCA CAGCTTGTCT GTAAGCGGAT
AGCGCGCAAA GCCACTACTG CCACTTTTGG AGACTGTGTA CGTCGAGGGC CTCTGCCAGT GTCGAACAGA CATTCGCCTA 



81 


GCCGGGAGtA GACAAGCCCG TCAGGGCGCG TCAGCGGGTG TTGGCGGGTG TCGGGGCTGG CTTAACTATG 
CGGCCCTCGT CTGTTCGGGC AGTCCCGCGC AGTCGCCCAC AACCGCCCAC AGCCCCGACC GAATTGATAC 


CGGCATCAGA 
GCCGTAGtCT 


I6L 


GCAGATTGTA 
CGTCTAACAT 


StuI 

CTGAGAGTGC ACCATATGAA GCTtrTTGCA AAAGCCTAGG CCTCCAAAAA AGCCTCCTCA 
GACTCTCACG TGGTATACTT CGAAAAACGT TTTCGGATCC GGAGGTTTTT TCGGAGGAGT 


CTACTTCTGG 
GATGAAGACC 


241 


AATAGCTCAG 
TTATCGACTC 


AGGCCGAGGC GGCCTCGGCC TCTGCATAAA TAAAAAAAAT TAGTCAGCCA TGGGGCGGAG 
TCCGGCTCCG CCGGAGCCGG AGACGTATTT ATTTTTTTTA ATCAGTCGGT ACCCCGCCTC 


AATGGGCGGA 
TTACCCGCCT 


321 


ACTGGGCGGG 
TGACCCGCCC 


GAGGGAATTA TTGGCTATTG GCCATTGCAT ACGTTGTATC TATATCATAA TATGTACATT 
CTCCCTTAAT AACCGATAAC CGGTAACGTA TGCAACATAG ATATAGTATT ATACATGTAA 


TATATTGGCT 
ATATAACCGA 


401 


CATGTCCAAT 
GTACAGGTTA 


ATGACCGCCA TGTTGACATT GATTATTGAC TAGTTATTAA TAGTAATCAA TTACGGGGTC 
TACTGGCGGT ACAACTGTAA CTAATAACTG ATCAATAATT ATCATTAGTT AATGCCCCAG 


ATTAGTTCAT 
TAATCAAGTA 


481 


AGCCCATATA 
TCGGGTATAT 


TGGAGTTCCG CGTTACATAA CTTACGGTAA ATGGCCCGCC TGGCTGACCG CCCAACGACC CCCGCCCATT 
ACCTCAAGGC GCAATGTATT GAATGCCATT TACCGGGCGG ACCGACTGCC 6GGTTGCTGG GGGCGGGTAA 


561 


GACGTCAATA 
CTGCAGTTAT 


ATGACGTATG TTCCCATAGT AACGCCAATA GGGACTTTCC ATTGACGTCA ATGGGTGGAG 
TACTGCATAC AAGGGTATCA TTGCGGTTAT CCCTGAAAGG TAACTGCAGT TACCCACCTC 


TATTTACGGT 
ATAAATGCCA 


641 


AAACTGCCCA 
TTTGACGGGT 


CTTGGCAGTA CATCAAGTGT ATCATATGCC AAGTCCGCCC CCTATTGACG TCAATGACGG 
GAACCGTCAT GTAGTTCACA TAGTATACGG TTCAGGCGGG GGATAACTGC AGTTACTGCC 


TAAATGGCCC 
ATTTACCGGG 


721 


GCCTGGCATT 
CGGACCGTAA 


ATGCCCAGTA CATGACCTTA CGGGACTTTC CTACTTGGCA GTACATCTAC GTATTAGTCA 
TACGGGTCAT GTACTGGAAT GCCCTGAAAG GATGAACCGT CATGTAGATG CATAATCAGT 


TCGCTATTAC 
AGCGATAATG 


801 


CATGGTGATG 
GTACCACTAC 


CGGTTTTGGC AGTACACCAA TGCGCGTGGA TAGCGGTTTG ACTCACGGGG ATTTCCAAGT 
GCCAAAACCG TCATGTGGTT ACCCGCACCT ATCGCCAAAC TGAGTGCCCC TAAAGGTTCA 


CTCCACCCCA 
GAGGTGGGGT 


881 


TTGACGTCAA 

AACTGCAGTT 


TGGGAGTTTG TTTTGGCACC AAAATCAACG GGACTTTCCA AAATGTCGTA ATAACCCCGC 
ACCCTCAAAC AAAACCGTGG TTTTAGTTGC CCTGAAAGGT TTTACAGCAT TATTGGGGCG 


CCCGTTGACG 
GGGCAAC7GC 


961 


CAAATGGGCG 
GTTTACCCGC 


GTAGGCGTGT ACGGTGGGAG GTCTATATAA GCAGAGCTCG TTTAGTGAAC CCTCAGATCG 
CATCCGCACA TGCCACCCTC CAGATATATT CGTCTCGAGC AAATCACTTG GCACTCTAGC 


CCTGGAGACG 
GGACCTCTGC 


1041 


CCATCCACGC 
GGTAGGTGCG 


TGTTTTGACC TCCATAGAAG ACACCGGGAC CGATCCAGCC TCCGCGGCCG GGAACGGTGC 
ACAAAACTGG AGGTATCTTC TGTGGCCCTG GCTAGGTCGG AGGCGCCGGC CCTTGCCACG 


ATTGGAACGC 
TAACCTTGCG 


1121 


GGATTCCCCG 
CCTAAGGGGC 


TGCCAAGAGT GACGTAAGTA CCGCCTATAG ACTCTATAGG CACACCCCTT TGGCTCTTAT 
ACGGTTCTCA CTGCATTCAT GGCGGATATC TGAGATATCC GTGTGGGGAA ACCGAGAATA 


GCATGCTATA 
CGTACGATAT 


1201 


CTGTTTTTGG 
GACAAAAACC 


CTTGGGGCCt ATACACCCCC GCTCCTTATG CTATAGGTGA TGGTATAGCT TAGCCTATAG 
GAACCCCGGA TATGTGGGGG CGAGGAATAC GATATCCACT ACCATATCGA ATCGGATATC 


GTGTGGGTTA 
CACACCCAAT 


1281 


TTGACCATTA 
AACTGGTAAT 


TTGACCACTC CCCTATTGGT GACGATACTT TCCATTACTA ATCCATAACA TGGCTCTTTG 
AACTGGTGAG GGGATAACCA CTGCTATGAA AGGTAATGAT TAGGTATIGT ACCGAGAAAC 


CCACAACTAT 
GGTGTTGATA 


1361 


CTCTATTGGC 
GAGATAACCG 


TATATGCCAA TACTCTGTCC TTCAGAGACT GACACGGACT CTGTATTTTT ACAGGATGGG 
ATATACGGTT ATGAGACAGG AAGTCTCT6A CTGTGCCTGA GACATAAAAA TGTCCTACCC 


GTCCATTTAT 
CAGGTAAATA
1441 TATTTACAAA TTCACATATA CAACAACGCC GTCCCCCGTG CCCGCAGTTT TTATTAAACA TAGCGTGGGA TCTCCGACAT
ATAAATGTTT AAGTGTATAT GTTGTTGCGC CAGGGGGCAC GGGCGTCAAA AATAATTTGT ATCGCACCCT AGAGGCTGTA 



1521 


CTCGGGTACG TGTTCCGGAC ATGGGCTCTT CTCCGGTAGC 
GAGCCCATGC ACAAGGCCTG TACCCGAGAA GAGGCCATCG 


GGCGGAGCTT 
CCGCCTCGAA 


CCACATCCGA 
GGTGTAGGCT 


GCCCTGGTCC 
CGGGACCAGG 


CATCCGTCCA 
GTAGGCAGGT 


1601 


GCGGCTCATG GTCGCTCGGC AGCTCCTTGC TCCTAACAGT 
CGCCGAGTAC CAGCGAGCCG TCGAGGAACG AGGATTGTCA 


GGAGGCCAGA 
CCTCCGGTCT 


CTTAGGCACA 
GAATCCGTGT 


GCACAATGCC 
CGTGTTACGG 


CACCACCACC 
GTGGTGGTGG 


1681 


AGTGTGCCGC ACAAGGCCGt 
TCACACGGCG TGTTCCGGCA 


GGCGGTAGGG TATGTGTCTG 
CCGCCATCCC ATACACAGAC 


AAAATGAGCT 
TTTTACTCGA 


CGGAGATTGG 
GCCTCTAACC 


GCTCGCACCT 
CGAGCGTGGA 


GGACGCAGAT 
CCTGCGTCTA 


1761 


GGAAGACTTA AGGCAGCGGC 
CCTTCTGAAT TCCGTCGCCG 


AGAAGAAGAT GCAGGCAGCT 
TCTTCTTCTA CGTCCGTCGA 


GAGTTGTTGT 
CTCAACAACA 


ATTCTGATAA 
TAAGACTATT 


GAGTCAGAGG 
CTCAGTCTCC 


TAACTCCCGT 
ATTGAGGGCA 


1841 


TGCGGTGCTG TTAACGGTGG 
ACGCCACGAC AATTGCCACC 


AGGGCAGTGT AGTCrGAGCA 
TCCCGTCACA TCAGACTCGT 


GTACTCGTTG 
CATGAGCAAC 


CTGCCGCGCG 
GACGGCGCGC 


CGCCACCAGA 
GCGGTGGTCT 


CATAATAGCT 
GTATTATCGA 


+2 










ECORI 


M A A 


1921 


GACAGACTAA CAGACTGTTC 
CTGTCTGATT GTCTGACAAG 


CTTTCCATGG GTCTTTTCTG 
CAAAGGTACC CAGAAAAGAC 


CAGTCACCGT 
GTCAGTGGCA 


CGTCGACCTA 
GCAGCTGGAT 


AGAATTCACC 
TCTTAAGTGG 


ATGGCTGCAT 
TACCGACGTA 


-►2 
2001 


YAAQ G^K VLVL HPS 
ATGCAGCTCA GGGCTATAAG GTGCTAGTAC TCAACCCCTC 
TACGTC6AGT CCCGATATTC CACGATCATG AGTTGGGGAG 


VAA TLGF GAY 
TGTTGCTGCA ACACTGGGCT TTGGTGCrTA 
ACAACGACGT TGTGACCCGA AACCACGAAT 


M S K 
CATGTCCAAG 
GTACAGGTTC 



+ 2AHGI DPN rRT GVRT ITT GSP I T Y S TYG 
2081 GCTCATGGGA TCGATCCTAA CATCAGGACC GGGGTGAGAA CAATTACCAC TGGCAGCCCC ATCACGTACT CCACCTAC3G 
CGAGTACCCT AGCTAGGATT GTAGTCCTGG CCCCACTCTT GTTAATGGTG ACCGTCGGGG TAGTGCATGA GGTGGATGCC 



+ 2 KTL ADGG CSG GAY Dili CDE CHS IDA 
2161 CAAGTTCCTT GCCGACGGCC GGTGCTCGGG GGGCGCTTAT GACATAATAA TTTGTGACGA GTGCCACTCC ACGGATGCCA 
GTTCAAGGAA CGGCTGCCGC CCACGAGCCC CCCGCGAATA CTGTATTATT AAACACTGCT CACGGTGAGG TGCCTACGGT 



+2TSIL GIG TVLD QAE TAG ARLV VLA TAT 
2241 CATCCATCTT GGGCATTGGC ACTGTCCTTG ACCAAGCAGA GACTGCGGGG GCGAGACTGG TTGTGCTCGC CACCGCCACC 
GTAGGTAGAA CCCGTAACCG TGACAGGAAC TGGTTCGTCT CTGACGCCCC CGCTCTGACC AACACGAGCG GTGGCGGTGG 



♦2PPGS VTV PHP NIEE VA L STT GEIP FYG 
2321 CCTCCGGGCT CCGTCACTGT GCCCCATCCC AACATCGAGG AGGTTGCTCT GTCCACCACC GGAGAGATCC CTTTTTACGG 
GGAGGCCCGA GGCAGTGACA CG6GGTAGGG TTGTAGCTCC TCCAACGAGA CA6GTGGTGG CCTCTCTAGG GAAAAATGCC 



^2 KAI PLEV IKG GRH LIFC HSK KKC DEL 
2401 CAAGGCTATC CCCCTCGAAG TAATCAAGGG GGGGAGACAt CTCATCTTCT GTCATTCAAA GAAGAAGTGC GACGAACICG 
GTTCCGATAG GGGGAGCTTC ATTAGTTCCC CCCCTCTGTA GAGTAGAAGA CAGTAAGTTT CTTGTTCACG CTGCTTGAGC 



+2AAKL VAL GINA VAY YRG LDVS VIP TSG 
2481 CCGCAAAGCT GGTCGCATTG GGCATCAATG CCGTCGCCTA CTACCGCGGT CTTGACGTGT CCGTCATCCC GACCAGCGGC 
GGCGTTTCGA CCAGCCTAAC CCGTACTTAC GGCACCGGAT GATGGCGCCA GAACTGCACA GGCAGTAGGG CTGGTCGCCG 



+2DVVV VAT DA L MTGY TGD FDS VtDC NTC 
2561 GATGTTGTCG TCGTGGCAAC CGATGCCCTC ATGACCGGCT ATACCGGCGA CTTCGACTCC GTGATAGACT. GCAATACGTG 
CTACAACAGC AGCACCGTTG GCTACGGGAG TACTGGCCGA TATGGCCGCT GAAGCTGAGC CACTATCTGA CGTTATGCAC
+2 V TQ TVDF SLD PTF TIET ITL PQD AVS
2641 . TGTCACCCAG ACAGTCGATT TCAGCCTTGA CCCTACCTTC ACCATTGAGA CAATCACGCT CCCCCAAGAT GCTGTCTCCC 
ACAGTGGGTC TGTCAGCTAA AGTCGGAACT GGGATGGAAG TGGTAACTCT GTTAGTGCGA GGGGGTTCTA CGACAGAGGG 



♦2RTQR-RGR TGRG KPG I X R FVAP GER PSG 
2*721 GCACTCAACG TCGGGGCAGG ACTGGCAGGG GGAAGCCAGG CATCTACAGA TTTGTGGCAC CGGGGGAGCG CCCCTCCGGC 
CGTGAGTTGC AGCCCCGTCC TGACCGTCCC CCTTCGGTCC GTAGATGTCT AAACACCGTG GCCCCCTCGC GGGGAGGCCG 



+ 2MrOS SVL CEC YDAG CAW YEL TPAE TTV 
2801 ATGT7CGACT CGTCCGTCCT CTGTGAGTGC TATGACGCAG GCTGTGCTTG GTATGAGCTC ACGCCCGCCG AGACTACAGT 
TACAAGCTGA GCAGGCAGGA GACACTCACG ATACTGCGTC CGACACGAAC CATACTCGAG TGCGGGCGGC TCTGATGTCA 



+2 RLR AYMN TPG LPV CQDH .LEF WEG VFT 

StuI 



2881 TAGGCTACGA GCGTACATGA ACACCCCGGG GCTTCCCGTG TGCCAGGACC ATCTTGAATT TTGGGAGGGC GTCTTTACAG 
AtCCGATGCT CGCATGTACT TGTGGGGCCC CGAAGGGCAC ACGGTCCTGG TAGAACTTAA AACCCTCCCG CAGAAATGTC 



+2 


GLTH IDA HFLS QTK QSG EMLP YLV AYQ 

Stur 


2961 


— — « ^« FPi^^^j^ i^««>wM«nfitr*>i>«.<ii ^/^^A/^Jk/^&aa fif*h./l^tlTtZfZfZ flUdAACCTTC PTTACCTC3GT &GCGTACCAA 

GCCTCACTCA TATAGATGCC CACTTTCTAT CCCAv»AtAftA ijt.A«i»wivawj uavjaa^v.! v.i inv«i^i.uuA nu\«v3 x nv>\-rvt 

CGGAGTGAGT ATATCTACGG GTGAAAGATA GGGTCTGTTT CGTCTCACCC CTCTTGGAAG GAATGGACCA TCGCATGGTT 


+2 
3041 


ATVC ARA QAP PPSW DQM W KC LIRL KPT 

^^^m.^^^tm^m ^/■i-**!!**^^/*/* fr^ji.Tinnnnn'p rrrrTKTrtZT fin,flACClACiA*T GTGGAAGTGT TTGATTCGCC TCAAGCCCAC 
GCCACCGTGT GCGCTJWGGC TCAAGCCCCX L.i.*UuwAiUVji i3iji»«\-w\ij/*A ^3^^awv^^JLv^. * awws^w *. w«t*^w>^ww.iw 

CGGTGGCACA CGCGATCCCG AGTTCGGGGA GGGGGTAGCA CCCTGGTCTA CACCTTCACA AACTAAGCGG AGTTCGGGTG 


+2 
3121 


LHG P TPL LYR LGA VQNE ITL THP VTK 

^f^iL^j-^mr*^!"*^ ir>f*r>m*^mKr>K/^ KrTfzrezT'nr'P flTTCAdAATG AAATCACCCT GACGCACCCA GTCACCAAAT 
CCTCCATGGG CCAACnCCCC TGCTATACAG ACTWjW-W-i lai iv.i\i»MiMvi /uu^xv^p^j^v^.^' x \jnv,\jwi-v«-v.v,<-k w 

GGAGGTACCC GGTTGTGGGG ACGATATGTC TGACCCGCGA CAAGTCTTAC TTTAGTGGGA CTGCGTGGGT CAGTGGTTTA 


+2 
3201 


YIMT CMS ADLE VVT STW YLVG GVL AAL 
ACATCATGAC ATGCATGTCG GCCGACCTGG AGGTCGTCAC GAGCACCTGG GtGCTCGTTG GCGGCGTCCT GGCTGCTTTG 
TGTAGTACTG TACGTACAGC CGGCTGGACC TCCAGCAGTG CTCGTGGACC CACGAGCAAC CGCCGCAGGA CCGACGAAAC 


+2 
3281 


AAYC LST GCV VIVG RVV LSG KPAI IPD 
GCCGCGTATT GCCTGTCAAC AGGCTGCGTG GTCATAGTGG GCAGGGTCGT CTTGTCCGGG AAGCCGGCAA TCATACCTGA 
CGGCGCATAA CGGACAGTTG TCCGACGCAC CAGTATCACC CGTCCCAGCA GAACAGGCCC TTCGGCCGTT AGTATGGACT 


+2 
3361 


REV LYRE FOE MEE CSQH LPY I Z Q GMM 

cagggaagtc ctctaccgag agttcgatga gatggaagag tgctctcagc acttaccgta catcgagcaa gggatgatgc 
ct^c^^Sg gagatggctc tcaagctact ctaccttctc acgagagtcg tgaatggcat gtagctcgtt ccctactacg 


>2 
3441 


LAEO FKQ KALG LLQ TAS RQAE VIA PAV 
TCGCCGAGCA GTTCAAGCAG AAGGCCCTCG GCCTCCTGCA GACCGCGTCC CGTCAGGCAG AGGTTATCGC CCCTGCTGTC 

^«G??^ ?tcEgggagc cggaggacgt ctggcgcagg gcagtccgtc tccaatagcg gggacgacag 


♦2 

3521 


OTMII QKL BTF WAKH MWN FI S GIQY LAG 
CAGACCAACT GGCAAAAACT CGAGACCTTC TGGGCGAAGC ATATGTGGAA CTTCATCAGT GGGATACAAT ACTTGGCGGG 
G?^iGTTGA ccSttttga GCTCTGGAAG ACCCGCTTCG TATACACCTT GAAGTAGTCA CCCTATGTTA TGAACCGCCC 


+2 
3601 


T<?T LPGN PAI ASL MAFT AAV TSP ^TT 
CTTGTCAACG CTGCCTGGTA ACCCCGCCAT TGCTTCATTG ATGGCTTTTA CAGCTGCTGT CACCAGCCCA CTAACCACTA 

gIISgc SScat TGGGGCGGTA acgaagtaac taccgaaaat gtcgacgaca gtggtcgggt gattggtgat 


^2 
3681 


eoTT TTN ILGG MVA AQL ARPG ART AfV 
GCCAAACCCT CCTCTTCAAC ATATTGGGGG GGTGGGTGGC TGCCCAGCTC CCCGCCCCCG GTGCCGCTAC ^GCCTTTGTQ 
CGGTTTGGGA aOAGAAGITG lATAACCCCC CCACCCACCG ACCGGTCGAG CGGCGGGGGC CACGGCGATG ACGGAAACAC
*2 G k C L AGA AIG SVGL GKV LID ILAG YGA
3761 GGCGCTGGCT TAGCTGGCGC CGCCATCGGC AGTGTTGGAC TGGGGAAGGT CCTCATAGAC ATCCTTGCAG GGTATGGCGC 
CCGCGACCGA ATCGACCGCG GCGGTAGCCG TCACAACCTG ACCCCTTCCA GGAGTATCTG TAGGAACGTC CCATACCGCG 

*.2 GVA G ALV ATK IMS GEVP STE DLV NLL 
3841 GGGCGTGGCG GGAGCTCTTG TGGCATTCAA GATCATGAGC GGTGAGGTCC CCTCCACGGA GGACCTGGTC AATCTACTGC 
CCCGCACCGC CCTCGAGAAC ACCGTAAGTT CTAGTACTCG CCACTCCAGG GGAGGTGCCT CCTGGACCAG TTAGATGACG 



t2PAIL SPG ALVV GVV CAA ILRR HVG PGE 
3921 CCGCCATCCT CTCGCCCGGA GCCCTCGTAG TCGGCGTGGT CTGTGCAGCA ATACTGCGCC GGCACGTTGG CCCGGGCGAG 
GGCGGTAGGA GAGCGGGCCT CGGGAGCATC AGCCGCACCA GACACGTCGT TATGACGCGG CCGTGCAACC GGGCCCGCTC 



^•2GAVQ WMN RLI AFAS RGN HVS PTHY VPS 
4001 GGGGCAGTGC AGTGGATGAA CCGGCTGATA GCCTtCGCCT CCCGGGGGAA CCATGTTTCC CCCACGCACT ACGTGCCGGA 
CCCCGTCACG TCACCTACTT GGCCGACTAT CGGAAGCGGA GGGCCCCCTT GGTACAAAGG GGGTGCGTGA TGCACGGCCT 

+2 SOA AARV TAI LSS LTVT QLL RRL HQW 
4081 GAGCGATGCA GCTGCCCGCG TCACTGCCAT ACTCAGCAGC CTCACTGTAA CCCAGCTCCT GAGGCGACTG CACCAGTGGA 
CTCGCTACGT CGACGGGCGC AGTGACGGTA TGAGTCGTCG GAGTGACATT GGGTCGAGGA CTCCGCTGAC GTGGTCACCT 

+2ISSE CTT PCSG SWL RDI WDWI CEV LSD 
4161 TAAGCTCGGA GTGTACCACT CCATGCTCCG GTTCCTGGCT AAGGGACATC TGGGACTGGA TATGCGAGGT GTTGAGCGAC 
ATTCGAGCCT CACATGGTGA GGTACGAGGC CAAGGACCGA TTCCCTGTAG ACCCTGACCT ATACGCTCCA CAACTCGCTG 



*-2rKTW LKA KLM PQLP GIP FVS CQRG YKG 

BamHI 



4241 TTTAAGACCT GGCTAAAAGC TAAGCTCATG CCACAGCTGC CTGGGATCCC CTTTGTGTCC TGCCAGCCCG GGTATAAGGG 
AAATtCTGGA CCGATTTTCG ATTCGAGTAC GGTGTCGACG GACCCTAGGG GAAACACAGG ACGGTCGCGC CCATATTCCC 



^.Z VWR GDGI MHT RCH CGAE ITG HVK NGT 
4321 GGTCTGGCGA GGGGACGGCA TCATGCACAC TCGCTGCCAC TGTGGAGCTG AGATCACTGG ACATGTCAAA AACGGGACGA 
CCAGACCGCT CCCCTGCCGT AGTACGTGTG AGCGACGGTG ACACCTCGAC TCTAGTGACC TGTACAGTTT TTGCCCTGCT 

+2MRIV GPR TCRN MWS GTF PINA YTT GPC
4401 TGAGGATCGT CGGTCCTAGG ACCTGCAGGA ACATGTGGAG TGGGACCTTC CCCATTAATG CCTACACCAC GGGCCCCTGT
ACTCCTAGCA GCCAGGATCC TGGACGTCCT TGTACACCTC ACCCTGGAAG GGGTAATTAC GGATGTGGTG CCCGGGGACA



+2TPLP APN YTF ALWR VSA EEY VEIR QVG
4481 ACCCCCCTTC CTGCGCCGAA CTACACGTTC GCGCTATGGA GGGTGTCTGC AGAGGAATAC GTGGAGATAA GGCAGGTGGG
TGGGGGGAAG GACGCGGCTT GATGTGCAAG CGCGATACCT CCCACAGACG TCTCCTTATG CACCTCTATT CCGTCCACCC



+2 DFH YVTG MTT DNL KCPC QVP SPE FFT
4561 GGACTTCCAC TACGTGACGG GTATGACTAC TGACAATCTT AAATGCCCGT GCCAGGTCCC ATCGCCCGAA TTTTTCACAG
CCTGAAGGTG ATGCACTGCC CATACTGATG ACTGTTAGAA TTTACGGGCA CGGTCCAGGG TAGCGGGCTT AAAAAGTGTC



+2ELDG VRL HRFA PPC KPL LREE VSF RVG
4641 AATTGGACGG GGTGCGCCTA CATAGGTTTG CGCCCCCCTG CAAGCCCTTG CTGCGGGAGG AGGTATCATT CAGAGTAGGA
TTAACCTGCC CCACGCGGAT GTATCCAAAC GCGGGGGGAC GTTCGGGAAC GACGCCCTCC TCCATAGTAA GTCTCATCCT 



+2LHEY PVG SQL PCEP EPD VAV LTSM LTD
4721 CTCCACGAAT ACCCGGTAGG GTCGCAATTA CCTTGCGAGC CCGAACCGGA CGTGGCCGTG TTGACGTCCA TGCTCACTGA
GAGGTGCTTA TGGGCCATCC CAGCGTTAAT GGAACGCTCG GGCTTGGCCT GCACCGGCAC AACTGCAGGT ACGAGTGACT 

+2 PSH ITAE AAG RRL ARGS PPS VAS SSA
4801 TCCCTCCCAT ATAACAGCAG AGGCGGCCGG GCGAAGGTTG GCGAGGGGAT CACCCCCCTC TGTGGCCAGC TCCTCGGCTA
AGGGAGGGTA TATTGTCGTC TCCGCCGGCC CGCTTCCAAC CGCTCCCCTA GTGGGGGGAG ACACCGGTCG AGGAGCCGAT

+2SQLS APS LKAT CTA NHD SPDA ELI EAN
4881 GCCAGCTATC CGCTCCATCT CTCAAGGCAA CTTGCACCGC TAACCATGAC TCCCCTGATG CTGAGCTCAT AGAGGCCAAC
CGGTCGATAG GCGAGGTAGA GAGTTCCGTT GAACGTGGCG ATTGGTACTG AGGGGACTAC GACTCGAGTA TCTCCGGTTG



+2LLWR QEM GGN ITRV ESE NKV VILD SFD
4961 CTCCTATGGA GGCAGGAGAT GGGCGGCAAC ATCACCAGGG TTGAGTCAGA AAACAAAGTG GTGATTCTGG ACTCCTTCGA
GAGGATACCT CCGTCCTCTA CCCGCCGTTG TAGTGGTCCC AACTCAGTCT TTTGTTTCAC CACTAAGACC TGAGGAAGCT 



+2 PLV AEED ERE ISV PAEI LRK SRR FAQ
5041 TCCGCTTGTG GCGGAGGAGG ACGAGCGGGA GATCTCCGTA CCCGCAGAAA TCCTGCGGAA GTCTCGGAGA TTCGCCCAGG
AGGCGAACAC CGCCTCCTCC TGCTCGCCCT CTAGAGGCAT GGGCGTCTTT AGGACGCCTT CAGAGCCTCT AAGCGGGTCC



+2ALPV WAR PDYN PPL VETW KKP DYE PPV
5121 CCCTGCCCGT TTGGGCGCGG CCGGACTATA ACCCCCCGCT AGTGGAGACG TGGAAAAAGC CCGACTACGA ACCACCTGTG
GGGACGGGCA AACCCGCGCC GGCCTGATAT TGGGGGGCGA TCACCTCTGC ACCTTTTTCG GGCTGATGCT TGGTGGACAC 



+ 2VHGC PLP PPK SPPV PPP RKK RTVV LTE 
5201 GTCCATGGCT GCCCGCTTCC ACCTCCAAAG TCCCCTCCTG TGCCTCCGCC TCGGAAGAAG CGGACGGTGG TCCTCACTGA
CAGGTACCGA CGGGCGAAGG TGGAGGTTTC AGGGGAGGAC ACGGAGGCGG AGCCTTCTTC GCCTGCCACC AGGAGTGACT



+ 2 STL STAL AEL ATR SFGS SST SGI TGD 
5281 ATCAACCCTA TCTACTGCCT TGGCCGAGCT CGCCACCAGA AGCTTTGGCA GCTCCTCAAC TTCCGGCATT ACGGGCGACA
TAGTTGGGAT AGATGACGGA ACCGGCTCGA GCGGTGGTCT TCGAAACCGT CGAGGAGTTG AAGGCCGTAA TGCCCGCTGT



+2NTTT SSE PAPS GCP PDS DAES YSS MPP
5361 ATACGACAAC ATCCTCTGAG CCCGCCCCTT CTGGCTGCCC CCCCGACTCC GACGCTGAGT CCTATTCCTC CATGCCCCCC
TATGCTGTTG TAGGAGACTC GGGCGGGGAA GACCGACGGG GGGGCTGAGG CTGCGACTCA GGATAAGGAG GTACGGGGGG



+2 LEGE PGD PDL SDGS WST VSS EANA EDV
BamHI



5441 CTGGAGGGGG AGCCTGGGGA TCCGGATCTT AGCGACGGGT CATGGTCAAC GGTCAGTAGT GAGGCCAACG CGGAGGATGT
GACCTCCCCC TCGGACCCCT AGGCCTAGAA TCGCTGCCCA GTACCAGTTG CCAGTCATCA CTCCGGTTGC GCCTCCTACA 



>2 VCC SMSY SWT GAL VTPC AAE EQK LP! 
5521 CGTGTGCTGC TCAATGTCTT ACTCTTGGAC AGGCGCACTC GTCACCCCGT GCGCCGCGGA AGAACAGAAA CTGCCCATCA 
GCACACGACG AGTTACAGAA TGAGAACCTG TCCGCGTGAG CAGTGGGGCA CGCGGCGCCT TCTTGTCTTT GACGGGTAGT 



+2NALS NSL LRHH NLV YST TSRS ACQ RQK
5601 ATGCACTAAG CAACTCGTTG CTACGTCACC ACAATTTGGT GTATTCCACC ACCTCACGCA GTGCTTGCCA AAGGCAGAAG
TACGTGATTC GTTGAGCAAC GATGCAGTGG TGTTAAACCA CATAAGGTGG TGGAGTGCGT CACGAACGGT TTCCGTCTTC



+2KVTF DRL QVL DSHY QDV LKE VKAA ASK
5681 AAAGTCACAT TTGACAGACT GCAAGTTCTG GACAGCCATT ACCAGGACGT ACTCAAGGAG GTTAAAGCAG CGGCGTCAAA
TTTCAGTGTA AACTGTCTGA CGTTCAAGAC CTGTCGGTAA TGGTCCTGCA TGAGTTCCTC CAATTTCGTC GCCGCAGTTT



+2 VKA NLLS VEE ACS LTPP HSA KSK FGY 
5761 AGTGAAGGCT AACTTGCTAT CCGTAGAGGA AGCTTGCAGC CTGACGCCCC CACACTCAGC CAAATCCAAG TTTGGTTATG
TCACTTCCGA TTGAACGATA GGCATCTCCT TCGAACGTCG GACTGCGGGG GTGTGAGTCG GTTTAGGTTC AAACCAATAC



+2GAKD VRC HARK AVT HIN SVWK DLL EDN
5841 GGGCAAAAGA CGTCCGTTGC CATGCCAGAA AGGCCGTAAC CCACATCAAC TCCGTGTGGA AAGACCTTCT GGAAGACAAT
CCCGTTTTCT GCAGGCAACG GTACGGTCTT TCCGGCATTG GGTGTAGTTG AGGCACACCT TTCTGGAAGA CCTTCTGTTA 



+2VTPI DTT IMA KNEV FCV QPE KGGR KPA
5921 GTAACACCAA TAGACACTAC CATCATGGCT AAGAACGAGG TTTTCTGCGT TCAGCCTGAG AAGGGGGGTC GTAAGCCAGC
CATTGTGGTT ATCTGTGATG GTAGTACCGA TTCTTGCTCC AAAAGACGCA AGTCGGACTC TTCCCCCCAG CATTCGGTCG

+2 RLI VFPD LGV RVC EKMA LYD VVT KLP
6001 TCGTCTCATC GTGTTCCCCG ATCTGGGCGT GCGCGTGTGC GAAAAGATGG CTTTGTACGA CGTGGTTACA AAGCTCCCCT
AGCAGAGTAG CACAAGGGGC TAGACCCGCA CGCGCACACG CTTTTCTACC GAAACATGCT GCACCAATGT TTCGAGGGGA



+2LAVM GSS YGFQ YSP GQR VEFL VQA WKS

EcoRI 



6081 TGGCCGTGAT GGGAAGCTCC TACGGATTCC AATACTCACC AGGACAGCGG GTTGAATTCC TCGTGCAAGC GTGGAAGTCC 
ACCGGCACTA CCCTTCGAGG ATGCCTAAGG TTATGAGTGG TCCTGTCGCC CAACTTAAGG AGCACGTTCG CACCTTCAGG 

+2KKTP MGF SYD TRCF DST VTE SDIR TEE
6161 AAGAAAACCC CAATGGGGTT CTCGTATGAT ACCCGCTGCT TTGACTCCAC AGTCACTGAG AGCGACATCC GTACGGAGGA
TTCTTTTGGG GTTACCCCAA GAGCATACTA TGGGCGACGA AACTGAGGTG TCAGTGACTC TCGCTGTAGG CATGCCTCCT

+2 AIY QCCD LDP QAR VAIK SLT ERL YVG
6241 GGCAATCTAC CAATGTTGTG ACCTCGACCC CCAAGCCCGC GTGGCCATCA AGTCCCTCAC CGAGAGGCTT TATGTTGGGG 
CCGTTAGATG GTTACAACAC TGGAGCTGGG GGTTCGGGCG CACCGGTAGT TCAGGGAGTG GCTCTCCGAA ATACAACCCC 



+2GPLT NSR GENC GYR RCR ASGV LTT SCG 
6321 GCCCTCTTAC CAATTCAAGG GGGGAGAACT GCGGCTATCG CAGGTGCCGC GCGAGCGGCG TACTGACAAC TAGCTGTGGT 
CGGGAGAATG GTTAAGTTCC CCCCTCTTGA CGCCGATAGC GTCCACGGCG CGCTCGCCGC ATGACTGTTG ATCGACACCA 

+2NTLT CYI KAR AACR AAG LQD CTML VCG
6401 AACACCCTCA CTTGCTACAT CAAGGCCCGG GCAGCCTGTC GAGCCGCAGG GCTCCAGGAC TGCACCATGC TCGTGTGTGG
TTGTGGGAGT GAACGATGTA GTTCCGGGCC CGTCGGACAG CTCGGCGTCC CGAGGTCCTG ACGTGGTACG AGCACACACC 

+2 DDL VVIC ESA GVQ EDAA SLR AFT EAM
6481 CGACGACTTA GTCGTTATCT GTGAAAGCGC GGGGGTCCAG GAGGACGCGG CGAGCCTGAG AGCCTTCACG GAGGCTATGA
GCTGCTGAAT CAGCAATAGA CACTTTCGCG CCCCCAGGTC CTCCTGCGCC GCTCGGACTC TCGGAAGTGC CTCCGATACT

+2TRYS APP GDPP QPE YDL ELIT SCS SNV
6561 CCAGGTACTC CGCCCCCCCT GGGGACCCCC CACAACCAGA ATACGACTTG GAGCTCATAA CATCATGCTC CTCCAACGTG
GGTCCATGAG GCGGGGGGGA CCCCTGGGGG GTGTTGGTCT TATGCTGAAC CTCGAGTATT GTAGTACGAG GAGGTTGCAC 

+2SVAH DGA GKR VYYL TRD PTT PLAR AAW 
6641 TCAGTCGCCC ACGACGGCGC TGGAAAGAGG GTCTACTACC TCACCCGTGA CCCTACAACC CCCCTCGCGA GAGCTGCGTG 
AGTCAGCGGG TGCTGCCGCG ACCTTTCTCC CAGATGATGG AGTGGGCACT GGGATGTTGG GGGGAGCGCT CTCGACGCAC

+2 ETA RHTP VNS WLG NIIM FAP TLW ARM
6721 GGAGACAGCA AGACACACTC CAGTCAATTC CTGGCTAGGC AACATAATCA TGTTTGCCCC CACACTGTGG GCGAGGATGA
CCTCTGTCGT TCTGTGTGAG GTCAGTTAAG GACCGATCCG TTGTATTAGT ACAAACGGGG GTGTGACACC CGCTCCTACT

+2ILMT HFF SVLI ARD QLE QALD CEI YGA
6801 TACTGATGAC CCATTTCTTT AGCGTCCTTA TAGCCAGGGA CCAGCTTGAA CAGGCCCTCG ATTGCGAGAT CTACGGGGCC
ATGACTACTG GGTAAAGAAA TCGCAGGAAT ATCGGTCCCT GGTCGAACTT GTCCGGGAGC TAACGCTCTA GATGCCCCGG

+2CYSI EPL DLP PIIQ RLH GLS AFSL HSY
6881 TGCTACTCCA TAGAACCACT GGATCTACCT CCAATCATTC AAAGACTCCA TGGCCTCAGC GCATTTTCAC TCCACAGTTA
ACGATGAGGT ATCTTGGTGA CCTAGATGGA GGTTAGTAAG TTTCTGAGGT ACCGGAGTCG CGTAAAAGTG AGGTGTCAAT

+2 SPG EINR VAA CLR KLGV PPL RAW RHR
6961 CTCTCCAGGT GAAATCAATA GGGTGGCCGC ATGCCTCAGA AAACTTGGGG TACCGCCCTT GCGAGCTTGG AGACACCGGG
GAGAGGTCCA CTTTAGTTAT CCCACCGGCG TACGGAGTCT TTTGAACCCC ATGGCGGGAA CGCTCGAACC TCTGTGGCCC

7041 CCCGGAGCGT CCGCGCTAGG CTTCTGGCCA GAGGAGGCAG GGCTGCCATA TGTGGCAAGT ACCTCTTCAA CTGGGCAGTA
GGGCCTCGCA GGCGCGATCC GAAGACCGGT CTCCTCCGTC CCGACGGTAT ACACCGTTCA TGGAGAAGTT GACCCGTCAT

+2RTKL KLT PIA AAGQ LDL SGW FTAG YSG
7121 AGAACAAAGC TCAAACTCAC TCCAATAGCG GCCGCTGGCC AGCTGGACTT GTCCGGCTGG TTCACGGCTG GCTACAGCGG
TCTTGTTTCG AGTTTGAGTG AGGTTATCGC CGGCGACCGG TCGACCTGAA CAGGCCGACC AAGTGCCGAC CGATGTCGCC



+2 GDI YHSV SHA RPR WIWF CLL LLA AGV
7201 GGGAGACATT TATCACAGCG TGTCTCATGC CCGGCCCCGC TGGATCTGGT TTTGCCTACT CCTGCTTGCT GCAGGGGTAG 
CCCTCTGTAA ATAGTGTCGC ACAGAGTACG GGCCGGGGCG ACCTAGACCA AAACGGATGA GGACGAACGA CGTCCCCATC 



+2GIYL LPN R 
7281 GCATCTACCT CCTCCCCAAC CGATGAAGGT TGGGGTAAAC ACTCCGGCCT AAAAAAAAAA AAAAATCTAG AAAGGCGCGC 
CGTAGATGGA GGAGGGGTTG GCTACTTCCA ACCCCATTTG TGAGGCCGGA TTTTTTTTTT TTTTTAGATC TTTCCGCGCG

BamHI MluI



7361 CAAGATATCA AGGATCCACT ACGCGTTAGA GCTCGCTGAT CAGCCTCGAC TGTGCCTTCT AGTTGCCAGC CATCTGTTGT 
GTTCTATAGT TCCTAGGTGA TGCGCAATCT CGAGCGACTA GTCGGAGCTG ACACGGAAGA TCAACGGTCG GTAGACAACA 

7441 TTGCCCCTCC CCCGTGCCTT CCTTGACCCT GGAAGGTGCC ACTCCCACTG TCCTTTCCTA ATAAAATGAG GAAATTGCAT
AACGGGGAGG GGGCACGGAA GGAACTGGGA CCTTCCACGG TGAGGGTGAC AGGAAAGGAT TATTTTACTC CTTTAACGTA

7521 CGCATTGTCT GAGTAGGTGT CATTCTATTC TGGGGGGTGG GGTGGGGCAG GACAGCAAGG GGGAGGATTG GGAAGACAAT
GCGTAACAGA CTCATCCACA GTAAGATAAG ACCCCCCACC CCACCCCGTC CTGTCGTTCC CCCTCCTAAC CCTTCTGTTA



7601 AGCAGGCATG CTGGGGAGCT CTTCCGCTTC CTCGCTCACT GACTCGCTGC GCTCGGTCGT TCGGCTGCGG CGAGCGGTAT
TCGTCCGTAC GACCCCTCGA GAAGGCGAAG GAGCGAGTGA CTGAGCGACG CGAGCCAGCA AGCCGACGCC GCTCGCCATA 

7681 CAGCTCACTC AAAGGCGGTA ATACGGTTAT CCACAGAATC AGGGGATAAC GCAGGAAAGA ACATGTGAGC AAAAGGCCAG
GTCGAGTGAG TTTCCGCCAT TATGCCAATA GGTGTCTTAG TCCCCTATTG CGTCCTTTCT TGTACACTCG TTTTCCGGTC 



7761 CAAAAGGCCA GGAACCGTAA AAAGGCCGCG TTGCTGGCGT TTTTCCATAG GCTCCGCCCC CCTGACGAGC ATCACAAAAA
GTTTTCCGGT CCTTGGCATT TTTCCGGCGC AACGACCGCA AAAAGGTATC CGAGGCGGGG GGACTGCTCG TAGTGTTTTT



7841 TCGACGCTCA AGTCAGAGGT GGCGAAACCC GACAGGACTA TAAAGATACC AGGCGTTTCC CCCTGGAAGC TCCCTCGTGC 
AGCTGCGAGT TCAGTCTCCA CCGCTTTGGG CTGTCCTGAT ATTTCTATGG TCCGCAAAGG GGGACCTTCG AGGGAGCACG

7921 GCTCTCCTGT TCCGACCCTG CCGCTTACCG GATACCTGTC CGCCTTTCTC CCTTCGGGAA GCGTGGCGCT TTCTCAATGC 

CGAGAGGACA AGGCTGGGAC GGCGAATGGC CTATGGACAG GCGGAAAGAG GGAAGCCCTT CGCACCGCGA AAGAGTTACG

8001 TCACGCTGTA GGTATCTCAG TTCGGTGTAG GTCGTTCGCT CCAAGCTGGG CTGTGTGCAC GAACCCCCCG TTCAGCCCGA
AGTGCGACAT CCATAGAGTC AAGCCACATC CAGCAAGCGA GGTTCGACCC GACACACGTG CTTGGGGGGC AAGTCGGGCT

8081 CCGCTGCGCC TTATCCGGTA ACTATCGTCT TGAGTCCAAC CCGGTAAGAC ACGACTTATC GCCACTGGCA GCAGCCACTG
GGCGACGCGG AATAGGCCAT TGATAGCAGA ACTCAGGTTG GGCCATTCTG TGCTGAATAG CGGTGACCGT CGTCGGTGAC

8161 GTAACAGGAT TAGCAGAGCG AGGTATGTAG GCGGTGCTAC AGAGTTCTTG AAGTGGTGGC CTAACTACGG CTACACTAGA
CATTGTCCTA ATCGTCTCGC TCCATACATC CGCCACGATG TCTCAAGAAC TTCACCACCG GATTGATGCC GATGTGATCT

8241 AGGACAGTAT TTGGTATCTG CGCTCTGCTG AAGCCAGTTA CCTTCGGAAA AAGAGTTGGT AGCTCTTGAT CCGGCAAACA 
TCCTGTCATA AACCATAGAC GCGAGACGAC TTCGGTCAAT GGAAGCCTTT TTCTCAACCA TCGAGAACTA GGCCGTTTGT

8321 AACCACCGCT GGTAGCGGTG GTTTTTTTGT TTGCAAGCAG CAGATTACGC GCAGAAAAAA AGGATCTCAA GAAGATCCTT
TTGGTGGCGA CCATCGCCAC CAAAAAAACA AACGTTCGTC GTCTAATGCG CGTCTTTTTT TCCTAGAGTT CTTCTAGGAA

8401 TGATCTTTTC TACGGGGTCT GACGCTCAGT GGAACGAAAA CTCACGTTAA GGGATTTTGG TCATGAGATT ATCAAAAAGG
ACTAGAAAAG ATGCCCCAGA CTGCGAGTCA CCTTGCTTTT GAGTGCAATT CCCTAAAACC AGTACTCTAA TAGTTTTTCC

8481 ATCTTCACCT AGATCCTTTT AAATTAAAAA TGAAGTTTTA AATCAATCTA AAGTATATAT GAGTAAACTT GGTCTGACAG
TAGAAGTGGA TCTAGGAAAA TTTAATTTTT ACTTCAAAAT TTAGTTAGAT TTCATATATA CTCATTTGAA CCAGACTGTC

8561 TTACCAATGC TTAATCAGTG AGGCACCTAT CTCAGCGATC TGTCTATTTC GTTCATCCAT AGTTGCCTGA CTCCCCGTCG
AATGGTTACG AATTAGTCAC TCCGTGGATA GAGTCGCTAG ACAGATAAAG CAAGTAGGTA TCAACGGACT GAGGGGCAGC

8641 TGTAGATAAC TACGATACGG GAGGGCTTAC CATCTGGCCC CAGTGCTGCA ATGATACCGC GAGACCCACG CTCACCGGCT 
ACATCTATTG ATGCTATGCC CTCCCGAATG GTAGACCGGG GTCACGACGT TACTATGGCG CTCTGGGTGC GAGTGGCCGA 



8721 CCAGATTTAT CAGCAATAAA CCAGCCAGCC GGAAGGGCCG AGCGCAGAAG TGGTCCTGCA ACTTTATCCG CCTCCATCCA 
GGTCTAAATA GTCGTTATTT GGTCGGTCGG CCTTCCCGGC TCGCGTCTTC ACCAGGACGT TGAAATAGGC GGAGGTAGGT 



8801 GTCTATTAAT TGTTGCCGGG AAGCTAGAGT AAGTAGTTCG CCAGTTAATA GTTTGCGCAA CGTTGTTGCC ATTGCTACAG 
CAGATAATTA ACAACGGCCC TTCGATCTCA TTCATCAAGC GGTCAATTAT CAAACGCGTT GCAACAACGG TAACGATGTC

8881 GCATCGTGGT GTCACGCTCG TCGTTTGGTA TGGCTTCATT CAGCTCCGGT TCCCAACGAT CAAGGCGAGT TACATGATCC 
CGTAGCACCA CAGTGCGAGC AGCAAACCAT ACCGAAGTAA GTCGAGGCCA AGGGTTGCTA GTTCCGCTCA ATGTACTAGG 

8961 CCCATGTTGT GCAAAAAAGC GGTTAGCTCC TTCGGTCCTC CGATCGTTGT CAGAAGTAAG TTGGCCGCAG TGTTATCACT 
GGGTACAACA CGTTTTTTCG CCAATCGAGG AAGCCAGGAG GCTAGCAACA GTCTTCATTC AACCGGCGTC ACAATAGTGA 

9041 CATGGTTATG GCAGCACTGC ATAATTCTCT TACTGTCATG CCATCCGTAA GATGCTTTTC TGTGACTGGT GAGTACTCAA 
CTACCAATAC CGTCGTGACG TATTAAGAGA ATGACAGTAC GGTAGGCATT CTACGAAAAG ACACTGACCA CTCATGAGTT 

9121 CCAAGTCATT CTGAGAATAG TGTATGCGGC GACCGAGTTG CTCTTGCCCG GCGTCAATAC GGGATAATAC CGCGCCACAT 
GGTTCAGTAA GACTCTTATC ACATACGCCG CTGGCTCAAC GAGAACGGGC CGCAGTTATG CCCTATTATG GCGCGGTGTA 

9201 AGCAGAACTT TAAAAGTGCT CATCATTGGA AAACGTTCTT CGGGGCGAAA ACTCTCAAGG ATCTTACCGC TGTTGAGATC
TCGTCTTGAA ATTTTCACGA GTAGTAACCT TTTGCAAGAA GCCCCGCTTT TGAGAGTTCC TAGAATGGCG ACAACTCTAG 

9281 CAGTTCGATG TAACCCACTC GTGCACCCAA CTGATCTTCA GCATCTTTTA CTTTCACCAG CGTTTCTGGG TGAGCAAAAA 
GTCAAGCTAC ATTGGGTGAG CACGTGGGTT GACTAGAAGT CGTAGAAAAT GAAAGTGGTC GCAAAGACCC ACTCGTTTTT

9361 CAGGAAGGCA AAATGCCGCA AAAAAGGGAA TAAGGGCGAC ACGGAAATGT TGAATACTCA TACTCTTCCT TTTTCAATAT
GTCCTTCCGT TTTACGGCGT TTTTTCCCTT ATTCCCGCTG TGCCTTTACA ACTTATGAGT ATGAGAAGGA AAAAGTTATA

9441 TATTGAAGCA TTTATCAGGG TTATTGTCTC ATGAGCGGAT ACATATTTGA ATGTATTTAG AAAAATAAAC AAATAGGGGT
ATAACTTCGT AAATAGTCCC AATAACAGAG TACTCGCCTA TGTATAAACT TACATAAATC TTTTTATTTG TTTATCCCCA

9521 TCCGCGCACA TTTCCCCGAA AAGTGCCACC TGACGTCTAA GAAACCATTA TTATCATGAC ATTAACCTAT AAAAATAGGC 
AGGCGCGTGT AAAGGGGCTT TTCACGGTGG ACTGCAGATT CTTTGGTAAT AATAGTACTC TAATTGGATA TTTTTATCCG



9601 GTATCACGAG GCCCTTTCGT C
CATAGTGCTC CGGGAAAGCA G

1 TCGCGCGTTT CGGTGATGAC GGTGAAAACC TCTGACACAT GCAGCTCCCG GAGACGGTCA CAGCTTGTCT GTAAGCGGAT
AGCGCGCAAA GCCACTACTG CCACTTTTGG AGACTGTGTA CGTCGAGGGC CTCTGCCAGT GTCGAACAGA CATTCGCCTA



81 GCCGGGAGCA GACAAGCCCG TCAGGGCGCG TCAGCGGGTG TTGGCGGGTG TCGGGGCTGG CTTAACTATG CGGCATCAGA
CGGCCCTCGT CTGTTCGGGC AGTCCCGCGC AGTCGCCCAC AACCGCCCAC AGCCCCGACC GAATTGATAC GCCGTAGTCT


StuI
161 GCAGATTGTA CTGAGAGTGC ACCATATGAA GCTTTTTGCA AAAGCCTAGG CCTCCAAAAA AGCCTCCTCA CTACTTCTGG
CGTCTAACAT GACTCTCACG TGGTATACTT CGAAAAACGT TTTCGGATCC GGAGGTTTTT TCGGAGGAGT GATGAAGACC


241 AATAGCTCAG AGGCCGAGGC GGCCTCGGCC TCTGCATAAA TAAAAAAAAT TAGTCAGCCA TGGGGCGGAG AATGGGCGGA
TTATCGAGTC TCCGGCTCCG CCGGAGCCGG AGACGTATTT ATTTTTTTTA ATCAGTCGGT ACCCCGCCTC TTACCCGCCT


321 ACTGGGCGGG GAGGGAATTA TTGGCTATTG GCCATTGCAT ACGTTGTATC TATATCATAA TATGTACATT TATATTGGCT
TGACCCGCCC CTCCCTTAAT AACCGATAAC CGGTAACGTA TGCAACATAG ATATAGTATT ATACATGTAA ATATAACCGA


401 CATGTCCAAT ATGACCGCCA TGTTGACATT GATTATTGAC TAGTTATTAA TAGTAATCAA TTACGGGGTC ATTAGTTCAT
GTACAGGTTA TACTGGCGGT ACAACTGTAA CTAATAACTG ATCAATAATT ATCATTAGTT AATGCCCCAG TAATCAAGTA


481 AGCCCATATA TGGAGTTCCG CGTTACATAA CTTACGGTAA ATGGCCCGCC TGGCTGACCG CCCAACGACC CCCGCCCATT
TCGGGTATAT ACCTCAAGGC GCAATGTATT GAATGCCATT TACCGGGCGG ACCGACTGGC GGGTTGCTGG GGGCGGGTAA


561 GACGTCAATA ATGACGTATG TTCCCATAGT AACGCCAATA GGGACTTTCC ATTGACGTCA ATGGGTGGAG TATTTACGGT
CTGCAGTTAT TACTGCATAC AAGGGTATCA TTGCGGTTAT CCCTGAAAGG TAACTGCAGT TACCCACCTC ATAAATGCCA


641 AAACTGCCCA CTTGGCAGTA CATCAAGTGT ATCATATGCC AAGTCCGCCC CCTATTGACG TCAATGACGG TAAATGGCCC
TTTGACGGGT GAACCGTCAT GTAGTTCACA TAGTATACGG TTCAGGCGGG GGATAACTGC AGTTACTGCC ATTTACCGGG


721 GCCTGGCATT ATGCCCAGTA CATGACCTTA CGGGACTTTC CTACTTGGCA GTACATCTAC GTATTAGTCA TCGCTATTAC
CGGACCGTAA TACGGGTCAT GTACTGGAAT GCCCTGAAAG GATGAACCGT CATGTAGATG CATAATCAGT AGCGATAATG


801 CATGGTGATG CGGTTTTGGC AGTACACCAA TGGGCGTGGA TAGCGGTTTG ACTCACGGGG ATTTCCAAGT CTCCACCCCA
GTACCACTAC GCCAAAACCG TCATGTGGTT ACCCGCACCT ATCGCCAAAC TGAGTGCCCC TAAAGGTTCA GAGGTGGGGT


881 TTGACGTCAA TGGGAGTTTG TTTTGGCACC AAAATCAACG GGACTTTCCA AAATGTCGTA ATAACCCCGC CCCGTTGACG
AACTGCAGTT ACCCTCAAAC AAAACCGTGG TTTTAGTTGC CCTGAAAGGT TTTACAGCAT TATTGGGGCG GGGCAACTGC


961 CAAATGGGCG GTAGGCGTGT ACGGTGGGAG GTCTATATAA GCAGAGCTCG TTTAGTGAAC CGTCAGATCG CCTGGAGACG
GTTTACCCGC CATCCGCACA TGCCACCCTC CAGATATATT CGTCTCGAGC AAATCACTTG GCAGTCTAGC GGACCTCTGC


1041 CCATCCACGC TGTTTTGACC TCCATAGAAG ACACCGGGAC CGATCCAGCC TCCGCGGCCG GGAACGGTGC ATTGGAACGC
GGTAGGTGCG ACAAAACTGG AGGTATCTTC TGTGGCCCTG GCTAGGTCGG AGGCGCCGGC CCTTGCCACG TAACCTTGCG


1121 GGATTCCCCG TGCCAAGAGT GACGTAAGTA CCGCCTATAG ACTCTATAGG CACACCCCTT TGGCTCTTAT GCATGCTATA
CCTAAGGGGC ACGGTTCTCA CTGCATTCAT GGCGGATATC TGAGATATCC GTGTGGGGAA ACCGAGAATA CGTACGATAT


1201 CTGTTTTTGG CTTGGGGCCT ATACACCCCC GCTCCTTATG CTATAGGTGA TGGTATAGCT TAGCCTATAG GTGTGGGTTA
GACAAAAACC GAACCCCGGA TATGTGGGGG CGAGGAATAC GATATCCACT ACCATATCGA ATCGGATATC CACACCCAAT


1281 TTGACCATTA TTGACCACTC CCCTATTGGT GACGATACTT TCCATTACTA ATCCATAACA TGGCTCTTTG CCACAACTAT
AACTGGTAAT AACTGGTGAG GGGATAACCA CTGCTATGAA AGGTAATGAT TAGGTATTGT ACCGAGAAAC GGTGTTGATA


1361 CTCTATTGGC TATATGCCAA TACTCTGTCC TTCAGAGACT GACACGGACT CTGTATTTTT ACAGGATGGG GTCCATTTAT
GAGATAACCG ATATACGGTT ATGAGACAGG AAGTCTCTGA CTGTGCCTGA GACATAAAAA TGTCCTACCC CAGGTAAATA

1441 TATTTACAAA TTCACATATA CAACAACGCC GTCCCCCGTG CCCGCAGTTT TTATTAAACA TAGCGTGGGA TCTCCGACAT
ATAAATGTTT AAGTGTATAT GTTGTTGCGG CAGGGGGCAC GGGCGTCAAA AATAATTTGT ATCGCACCCT AGAGGCTGTA


1521 CTCGGGTACG TGTTCCGGAC ATGGGCTCTT CTCCGGTAGC GGCGGAGCTT CCACATCCGA GCCCTGGTCC CATCCGTCCA
GAGCCCATGC ACAAGGCCTG TACCCGAGAA GAGGCCATCG CCGCCTCGAA GGTGTAGGCT CGGGACCAGG GTAGGCAGGT


1601 GCGGCTCATG GTCGCTCGGC AGCTCCTTGC TCCTAACAGT GGAGGCCAGA CTTAGGCACA GCACAATGCC CACCACCACC
CGCCGAGTAC CAGCGAGCCG TCGAGGAACG AGGATTGTCA CCTCCGGTCT GAATCCGTGT CGTGTTACGG GTGGTGGTGG


1681 AGTGTGCCGC ACAAGGCCGT GGCGGTAGGG TATGTGTCTG AAAATGAGCT CGGAGATTGG GCTCGCACCT GGACGCAGAT
TCACACGGCG TGTTCCGGCA CCGCCATCCC ATACACAGAC TTTTACTCGA GCCTCTAACC CGAGCGTGGA CCTGCGTCTA


1761 GGAAGACTTA AGGCAGCGGC AGAAGAAGAT GCAGGCAGCT GAGTTGTTGT ATTCTGATAA GAGTCAGAGG TAACTCCCGT
CCTTCTGAAT TCCGTCGCCG TCTTCTTCTA CGTCCGTCGA CTCAACAACA TAAGACTATT CTCAGTCTCC ATTGAGGGCA


1841 TGCGGTGCTG TTAACGGTGG AGGGCAGTGT AGTCTGAGCA GTACTCGTTG CTGCCGCGCG CGCCACCAGA CATAATAGCT
ACGCCACGAC AATTGCCACC TCCCGTCACA TCAGACTCGT CATGAGCAAC GACGGCGCGC GCGGTGGTCT GTATTATCGA


+2 M A A
EcoRI
1921 GACAGACTAA CAGACTGTTC CTTTCCATGG GTCTTTTCTG CAGTCACCGT CGTCGACCTA AGAATTCACC ATGGCTGCAT
CTGTCTGATT GTCTGACAAG GAAAGGTACC CAGAAAAGAC GTCAGTGGCA GCAGCTGGAT TCTTAAGTGG TACCGACGTA


+2YAAQ GYK VLVL NPS VAA TLGF GAY MSK
2001 ATGCAGCTCA GGGCTATAAG GTGCTAGTAC TCAACCCCTC TGTTGCTGCA ACACTGGGCT TTGGTGCTTA CATGTCCAAG
TACGTCGAGT CCCGATATTC CACGATCATG AGTTGGGGAG ACAACGACGT TGTGACCCGA AACCACGAAT GTACAGGTTC


+2AHGI DPN IRT GVRT ITT GSP ITYS TYG
2081 GCTCATGGGA TCGATCCTAA CATCAGGACC GGGGTGAGAA CAATTACCAC TGGCAGCCCC ATCACGTACT CCACCTACGG
CGAGTACCCT AGCTAGGATT GTAGTCCTGG CCCCACTCTT GTTAATGGTG ACCGTCGGGG TAGTGCATGA GGTGGATGCC


+2 KFL ADGG CSG GAY DIII CDE CHS TDA
2161 CAAGTTCCTT GCCGACGGCG GGTGCTCGGG GGGCGCTTAT GACATAATAA TTTGTGACGA GTGCCACTCC ACGGATGCCA
GTTCAAGGAA CGGCTGCCGC CCACGAGCCC CCCGCGAATA CTGTATTATT AAACACTGCT CACGGTGAGG TGCCTACGGT


+2TSIL GIG TVLD QAE TAG ARLV VLA TAT
2241 CATCCATCTT GGGCATTGGC ACTGTCCTTG ACCAAGCAGA GACTGCGGGG GCGAGACTGG TTGTGCTCGC CACCGCCACC
GTAGGTAGAA CCCGTAACCG TGACAGGAAC TGGTTCGTCT CTGACGCCCC CGCTCTGACC AACACGAGCG GTGGCGGTGG


+2PPGS VTV PHP NIEE VAL STT GEIP FYG
2321 CCTCCGGGCT CCGTCACTGT GCCCCATCCC AACATCGAGG AGGTTGCTCT GTCCACCACC GGAGAGATCC CTTTTTACGG
GGAGGCCCGA GGCAGTGACA CGGGGTAGGG TTGTAGCTCC TCCAACGAGA CAGGTGGTGG CCTCTCTAGG GAAAAATGCC


+2 KAI PLEV IKG GRH LIFC HSK KKC DEL
2401 CAAGGCTATC CCCCTCGAAG TAATCAAGGG GGGGAGACAT CTCATCTTCT GTCATTCAAA GAAGAAGTGC GACGAACTCG
GTTCCGATAG GGGGAGCTTC ATTAGTTCCC CCCCTCTGTA GAGTAGAAGA CAGTAAGTTT CTTCTTCACG CTGCTTGAGC


+2AAKL VAL GINA VAY YRG LDVS VIP TSG
2481 CCGCAAAGCT GGTCGCATTG GGCATCAATG CCGTGGCCTA CTACCGCGGT CTTGACGTGT CCGTCATCCC GACCAGCGGC
GGCGTTTCGA CCAGCGTAAC CCGTAGTTAC GGCACCGGAT GATGGCGCCA GAACTGCACA GGCAGTAGGG CTGGTCGCCG


+2DVVV VAT DAL MTGY TGD FDS VIDC NTC
2561 GATGTTGTCG TCGTGGCAAC CGATGCCCTC ATGACCGGCT ATACCGGCGA CTTCGACTCG GTGATAGACT GCAATACGTG
CTACAACAGC AGCACCGTTG GCTACGGGAG TACTGGCCGA TATGGCCGCT GAAGCTGAGC CACTATCTGA CGTTATGCAC



13/100 



WO 01/38360



PCT/US00/32326



pCMV-delNS35 

FIGURE 5 - Page 3 

+2 VTQ TVDF SLD PTF TIET ITL PQD AVS
2641 TGTCACGCAG ACAGTCGATT TCAGCCTTGA CCCTACCTTC ACCATTGAGA CAATCACGCT CCCCCAAGAT GCTGTCTCCC
ACAGTGCGTC TGTCAGCTAA AGTCGGAACT GGGATGGAAG TGGTAACTCT GTTAGTGCGA GGGGGTTCTA CGACAGAGGG

+2RTQR RGR TGRG KPG IYR FVAP GER PSG
2721 GCACTCAACG TCGGGGCAGG ACTGGCAGGG GGAAGCCAGG CATCTACAGA TTTGTGGCAC CGGGGGAGCG CCCCTCCGGC 
CGTGAGTTGC AGCCCCGTCC TGACCGTCCC CCTTCGGTCC GTAGATGTCT AAACACCGTG GCCCCCTCGC GGGGAGGCCG 

+2MFDS SVL CEC YDAG CAW YEL TPAE TTV
2801 ATGTTCGACT CGTCCGTCCT CTGTGAGTGC TATGACGCAG GCTGTGCTTG GTATGAGCTC ACGCCCGCCG AGACTACAGT 
TACAAGCTGA GCAGGCAGGA GACACTCACG ATACTGCGTC CGACACGAAC CATACTCGAG TGCGGGCGGC TCTGATGTCA 



+2 RLR AYMN TPG LPV CQDH LEF WEG VFT

StuI 

2881 TAGGCTACGA GCGTACATGA ACACCCCGGG GCTTCCCGTG TGCCAGGACC ATCTTGAATT TTGGGAGGGC GTCTTTACAG
ATCCGATGCT CGCATGTACT TGTGGGGCCC CGAAGGGCAC ACGGTCCTGG TAGAACTTAA AACCCTCCCG CAGAAATGTC 



+2GLTH IDA HFLS QTK QSG ENLP YLV AYQ 
StuI 

2961 GCCTCACTCA TATAGATGCC CACTTTCTAT CCCAGACAAA GCAGAGTGGG GAGAACCTTC CTTACCTGGT AGCGTACCAA 
CGGAGTGAGT ATATCTACGG GTGAAAGATA GGGTCTGTTT CGTCTCACCC CTCTTGGAAG GAATGGACCA TCGCATGGTT

+2ATVC ARA QAP PPSW DQM WKC LIRL KPT
3041 GCCACCGTGT GCGCTAGGGC TCAAGCCCCT CCCCCATCGT GGGACCAGAT GTGGAAGTGT TTGATTCGCC TCAAGCCCAC 
CGGTGGCACA CGCGATCCCG AGTTCGGGGA GGGGGTAGCA CCCTGGTCTA CACCTTCACA AACTAAGCGG AGTTCGGGTG 

+2 LHG PTPL LYR LGA VQNE ITL THP VTK 
3121 CCTCCATGGG CCAACACCCC TGCTATACAG ACTGGGCGCT GTTCAGAATG AAATCACCCT GACGCACCCA GTCACCAAAT 
GGAGGTACCC GGTTGTCGGG ACGATATGTC TGACCCGCGA CAAGTCTTAC TTTAGTGGGA CTGCGTGGGT CAGTGGTTTA 



+2YIMT CMS ADLE VVT STW VLVG GVL AAL
3201 ACATCATGAC ATGCATGTCG GCCGACCTGG AGGTCGTCAC GAGCACCTGG GTGCTCGTTG GCGGCGTCCT GGCTGCTTTG 
TGTAGTACTG TACGTACAGC CGGCTGGACC TCCAGCAGTG CTCGTGGACC CACGAGCAAC CGCCGCAGGA CCGACGAAAC 



+2AAYC LST GCV VIVG RVV LSG KPAI IPD 
3281 GCCGCGTATT GCCTGTCAAC AGGCTGCGTG GTCATAGTGG GCAGGGTCGT CTTGTCCGGG AAGCCGGCAA TCATACCTGA 
CGGCGCATAA CGGACAGTTG TCCGACGCAC CAGTATCACC CGTCCCAGCA GAACAGGCCC TTCGGCCGTT AGTATGGACT

+2 REV LYRE FDE MEE CSQH LPY IEQ GMM
3361 CAGGGAAGTC CTCTACCGAG AGTTCGATGA GATGGAAGAG TGCTCTCAGC ACTTACCGTA CATCGAGCAA GGGATGATGC
GTCCCTTCAG GAGATGGCTC TCAAGCTACT CTACCTTCTC ACGAGAGTCG TGAATGGCAT GTAGCTCGTT CCCTACTACG 



+2LAEQ FKQ KALG LLQ TAS RQAE VIA PAV
3441 TCGCCGAGCA GTTCAAGCAG AAGGCCCTCG GCCTCCTGCA GACCGCGTCC CGTCAGGCAG AGGTTATCGC CCCTGCTGTC
AGCGGCTCGT CAAGTTCGTC TTCCGGGAGC CGGAGGACGT CTGGCGCAGG GCAGTCCGTC TCCAATAGCG GGGACGACAG



+2QTNW QKL ETF WAKH MWN FIS GIQY LAG
3521 CAGACCAACT GGCAAAAACT CGAGACCTTC TGGGCGAAGC ATATGTGGAA CTTCATCAGT GGGATACAAT ACTTGGCGGG
GTCTGGTTGA CCGTTTTTGA GCTCTGGAAG ACCCGCTTCG TATACACCTT GAAGTAGTCA CCCTATGTTA TGAACCGCCC

+2 LST LPGN PAI ASL MAFT AAV TSP LTT
3601 CTTGTCAACG CTGCCTGGTA ACCCCGCCAT TGCTTCATTG ATGGCTTTTA CAGCTGCTGT CACCAGCCCA CTAACCACTA
GAACAGTTGC GACGGACCAT TGGGGCGGTA ACGAAGTAAC TACCGAAAAT GTCGACGACA GTGGTCGGGT GATTGGTGAT

3681 GCCAAACCCT CCTCTTCAAC ATATTGGGGG GGTGGGTGGC TGCCCAGCTC GCCGCCCCCG GTGCCGCTAC TGCCTTTGTG
CGGTTTGGGA GGAGAAGTTG TATAACCCCC CCACCCACCG ACGGGTCGAG CGGCGGGGGC CACGGCGATG ACGGAAACAC

+2GAGL AGA AIG SVGL GKV LID ILAG YGA
3761 GGCGCTGGCT TAGCTGGCGC CGCCATCGGC AGTGTTGGAC TGGGGAAGGT CCTCATAGAC ATCCTTGCAG GGTATGGCGC
CCGCGACCGA ATCGACCGCG GCGGTAGCCG TCACAACCTG ACCCCTTCCA GGAGTATCTG TAGGAACGTC CCATACCGCG 

+2 GVA GALV AFK IMS GEVP STE DLV NLL
3841 GGGCGTGGCG GGAGCTCTTG TGGCATTCAA GATCATGAGC GGTGAGGTCC CCTCCACGGA GGACCTGGTC AATCTACTGC
CCCGCACCGC CCTCGAGAAC ACCGTAAGTT CTAGTACTCG CCACTCCAGG GGAGGTGCCT CCTGGACCAG TTAGATGACG 



"■^PAIL SPG ALVV G V V CAA ILRR HVG PGE 
3921 CCGCCATCCT CTCGCCCGGA GCCCTCGTAG TCGGCGTGGT CTGTGCAGCA ATACTGCGCC GGCACGTTGG CCCGGGCGAG 
GGCGGTAGGA GAGCGGGCCT CGGGAGCATC AGCCGCACCA GACACGTCGT TATGACGCGG CCGTGCAACC GGGCCCGCTC



+2GAVQ WMN RLI AFAS RGN HVS PTHY VPE
4001 GGGGCAGTGC AGTGGATGAA CCGGCTGATA GCCTTCGCCT CCCGGGGGAA CCATGTTTCC CCCACGCACT ACGTGCCGGA
CCCCGTCACG TCACCTACTT GGCCGACTAT CGGAAGCGGA GGGCCCCCTT GGTACAAAGG GGGTGCGTGA TGCACGGCCT 



+2 SDA AARV TAI LSS LTVT QLL RRL HQW
4081 GAGCGATGCA GCTGCCCGCG TCACTGCCAT ACTCAGCAGC CTCACTGTAA CCCAGCTCCT GAGGCGACTG CACCAGTGGA
CTCGCTACGT CGACGGGCGC AGTGACGGTA TGAGTCGTCG GAGTGACATT GGGTCGAGGA CTCCGCTGAC GTGGTCACCT 



+2ISSE CTT PCSG SWL RDI WDWI CEV LSD
4161 TAAGCTCGGA GTGTACCACT CCATGCTCCG GTTCCTGGCT AAGGGACATC TGGGACTGGA TATGCGAGGT GTTGAGCGAC 
ATTCGAGCCT CACATGGTGA GGTACGAGGC CAAGGACCGA TTCCCTGTAG ACCCTGACCT ATACGCTCCA CAACTCGCTG 



+2FKTW LKA KLM PQLP GIP FVS CQRG YKG

BamHI 

4241 TTTAAGACCT GGCTAAAAGC TAAGCTCATG CCACAGCTGC CTGGGATCCC CTTTGTGTCC TGCCAGCGCG GGTATAAGGG 
AAATTCTGGA CCGATTTTCG ATTCGAGTAC GGTGTCGACG GACCCTAGGG GAAACACAGG ACGGTCGCGC CCATATTCCC



+2 VWR GDGI MHT RCH CGAE ITG HVK NGT
4321 GGTCTGGCGA GGGGACGGCA TCATGCACAC TCGCTGCCAC TGTGGAGCTG AGATCACTGG ACATGTCAAA AACGGGACGA
CCAGACCGCT CCCCTGCCGT AGTACGTGTG AGCGACGGTG ACACCTCGAC TCTAGTGACC TGTACAGTTT TTGCCCTGCT



+2MRIV GPR TCRN MWS GTF PINA YTT GPC
4401 TGAGGATCGT CGGTCCTAGG ACCTGCAGGA ACATGTGGAG TGGGACCTTC CCCATTAATG CCTACACCAC GGGCCCCTGT 
ACTCCTAGCA GCCAGGATCC TGGACGTCCT TGTACACCTC ACCCTGGAAG GGGTAATTAC GGATGTGGTG CCCGGGGACA 



+2TPLP APN YTF ALWR VSA EEY VEIR QVG
4481 ACCCCCCTTC CTGCGCCGAA CTACACGTTC GCGCTATGGA GGGTGTCTGC AGAGGAATAC GTGGAGATAA GGCAGGTGGG
TGGGGGGAAG GACGCGGCTT GATGTGCAAG CGCGATACCT CCCACAGACG TCTCCTTATG CACCTCTATT CCGTCCACCC



+2 DFH YVTG MTT DNL KCPC QVP SPE FFT 
4561 GGACTTCCAC TACGTGACGG GTATGACTAC TGACAATCTT AAATGCCCGT GCCAGGTCCC ATCGCCCGAA TTTTTCACAG
CCTGAAGGTG ATGCACTGCC CATACTGATG ACTGTTAGAA TTTACGGGCA CGGTCCAGGG TAGCGGGCTT AAAAAGTGTC



+2ELDG VRL HRFA PPC KPL LREE VSF RVG
4641 AATTGGACGG GGTGCGCCTA CATAGGTTTG CGCCCCCCTG CAAGCCCTTG CTGCGGGAGG AGGTATCATT CAGAGTAGGA
TTAACCTGCC CCACGCGGAT GTATCCAAAC GCGGGGGGAC GTTCGGGAAC GACGCCCTCC TCCATAGTAA GTCTCATCCT



+2LHEY PVG SQL PCEP EPD VAV LTSM LTD
4721 CTCCACGAAT ACCCGGTAGG GTCGCAATTA CCTTGCGAGC CCGAACCGGA CGTGGCCGTG TTGACGTCCA TGCTCACTGA
GAGGTGCTTA TGGGCCATCC CAGCGTTAAT GGAACGCTCG GGCTTGGCCT GCACCGGCAC AACTGCAGGT ACGAGTGACT 



+2 PSH ITAE AAG RRL ARGS PPS VAS SSA
4801 TCCCTCCCAT ATAACAGCAG AGGCGGCCGG GCGAAGGTTG GCGAGGGGAT CACCCCCCTC TGTGGCCAGC TCCTCGGCTA
AGGGAGGGTA TATTGTCGTC TCCGCCGGCC CGCTTCCAAC CGCTCCCCTA GTGGGGGGAG ACACCGGTCG AGGAGCCGAT

+2SQLS APS LKAT CTA NHD SPDA ELI EAN
4881 GCCAGCTATC CGCTCCATCT CTCAAGGCAA CTTGCACCGC TAACCATGAC TCCCCTGATG CTGAGCTCAT AGAGGCCAAC
CGGTCGATAG GCGAGGTAGA GAGTTCCGTT GAACGTGGCG ATTGGTACTG AGGGGACTAC GACTCGAGTA TCTCCGGTTG

+2LLWR QEM GGN ITRV ESE NKV VILD SFD
4961 CTCCTATGGA GGCAGGAGAT GGGCGGCAAC ATCACCAGGG TTGAGTCAGA AAACAAAGTG GTGATTCTGG ACTCCTTCGA
GAGGATACCT CCGTCCTCTA CCCGCCGTTG TAGTGGTCCC AACTCAGTCT TTTGTTTCAC CACTAAGACC TGAGGAAGCT

+2 PLV AEED ERE ISV PAEI LRK SRR FAQ
5041 TCCGCTTGTG GCGGAGGAGG ACGAGCGGGA GATCTCCGTA CCCGCAGAAA TCCTGCGGAA GTCTCGGAGA TTCGCCCAGG
AGGCGAACAC CGCCTCCTCC TGCTCGCCCT CTAGAGGCAT GGGCGTCTTT AGGACGCCTT CAGAGCCTCT AAGCGGGTCC 



+2ALPV WAR PDYN PPL VETW KKP DYE PPV
5121 CCCTGCCCGT TTGGGCGCGG CCGGACTATA ACCCCCCGCT AGTGGAGACG TGGAAAAAGC CCGACTACGA ACCACCTGTG

GGGACGGGCA AACCCGCGCC GGCCTGATAT TGGGGGGCGA TCACCTCTGC ACCTTTTTCG GGCTGATGCT TGGTGGACAC 

+2VHGC PLP PPK SPPV PPP RKK RTVV LTE
5201 GTCCATGGCT GCCCGCTTCC ACCTCCAAAG TCCCCTCCTG TGCCTCCGCC TCGGAAGAAG CGGACGGTGG TCCTCACTGA

CAGGTACCGA CGGGCGAAGG TGGAGGTTTC AGGGGAGGAC ACGGAGGCGG AGCCTTCTTC GCCTGCCACC AGGAGTGACT 

+ 2 STL STAL AEL ATR SFGS SST SGI TGD 

5281 ATCAACCCTA TCTACTGCCT TGGCCGAGCT CGCCACCAGA AGCTTTGGCA GCTCCTCAAC TTCCGGCATT ACGGGCGACA

TAGTTGGGAT AGATGACGGA ACCGGCTCGA GCGGTGGTCT TCGAAACCGT CGAGGAGTTG AAGGCCGTAA TGCCCGCTGT 



+2NTTT SSE PAPS GCP PDS DAES YSS MPP
5361 ATACGACAAC ATCCTCTGAG CCCGCCCCTT CTGGCTGCCC CCCCGACTCC GACGCTGAGT CCTATTCCTC CATGCCCCCC 
TATGCTGTTG TAGGAGACTC GGGCGGGGAA GACCGACGGG GGGGCTGAGG CTGCGACTCA GGATAAGGAG GTACGGGGGG 

+2 LEGE PGD PDL SDGS WST VSS EANA EDV
BamHI 



5441 CTGGAGGGGG AGCCTGGGGA TCCGGATCTT AGCGACGGGT CATGGTCAAC GGTCAGTAGT GAGGCCAACG CGGAGGATGT 
GACCTCCCCC TCGGACCCCT AGGCCTAGAA TCGCTGCCCA GTACCAGTTG CCAGTCATCA CTCCGGTTGC GCCTCCTACA 



^2 VCC SMSY SWT GAL VTPC AAE EQK LPI 
5521 CGTGTGCTGC TCAATGTCTT ACTCTTGGAC AGGCGCACTC GTCACCCCGT GCGCCGCGGA AGAACAGAAA CTGCCCATCA 
GCACACGACG AGTTACAGAA TGAGAACCTG TCCGCGTGAG CAGTGGGGCA CGCGGCGCCT TCTTGTCTTT GACGGGTAGT



+ 2NALS NSL LRHH NLV YST TSRS ACQ RQK 
5601 ATGCACTAAG CAACTCGTTG CTACGTCACC ACAATTTGGT GTATTCCACC ACCTCACGCA GTGCTTGCCA AAGGCAGAAG
TACGTGATTC GTTGAGCAAC GATGCAGTGG TGTTAAACCA CATAAGGTGG TGGAGTGCGT CACGAACGGT TTCCGTCTTC



+2KVTF DRL QVL DSHY QDV LKE VKAA ASK
5681 AAAGTCACAT TTGACAGACT GCAAGTTCTG GACAGCCATT ACCAGGACGT ACTCAAGGAG GTTAAAGCAG CGGCGTCAAA 
TTTCAGTGTA AACTGTCTGA CGTTCAAGAC CTGTCGGTAA TGGTCCTGCA TGAGTTCCTC CAATTTCGTC GCCGCAGTTT 



+2 VKA NLLS VEE ACS LTPP HSA KSK FGY
5761 AGTGAAGGCT AACTTGCTAT CCGTAGAGGA AGCTTGCAGC CTGACGCCCC CACACTCAGC CAAATCCAAG TTTGGTTATG
TCACTTCCGA TTGAACGATA GGCATCTCCT TCGAACGTCG GACTGCGGGG GTGTGAGTCG GTTTAGGTTC AAACCAATAC



+2GAKD VRC HARK AVT HIN SVWK DLL EDN
5841 GGGCAAAAGA CGTCCGTTGC CATGCCAGAA AGGCCGTAAC CCACATCAAC TCCGTGTGGA AAGACCTTCT GGAAGACAAT
CCCGTTTTCT GCAGGCAACG GTACGGTCTT TCCGGCATTG GGTGTAGTTG AGGCACACCT TTCTGGAAGA CCTTCTGTTA



+2VTPI DTT IMA KNEV FCV QPE KGGR KPA 
5921 GTAACACCAA TAGACACTAC CATCATGGCT AAGAACGAGG TTTTCTGCGT TCAGCCTGAG AAGGGGGGTC GTAAGCCAGC
CATTGTGGTT ATCTGTGATG GTAGTACCGA TTCTTGCTCC AAAAGACGCA AGTCGGACTC TTCCCCCCAG CATTCGGTCG

+2 RLI VFPD LGV RVC EKMA LYD VVT KLP
6001 TCGTCTCATC GTGTTCCCCG ATCTGGGCGT GCGCGTGTGC GAAAAGATGG CTTTGTACGA CGTGGTTACA AAGCTCCCCT
AGCAGAGTAG CACAAGGGGC TAGACCCGCA CGCGCACACG CTTTTCTACC GAAACATGCT GCACCAATGT TTCGAGGGGA



+2LAVM GSS YGFQ YSP GQR VEFL VQA WKS

EcoRI 



6081 TGGCCGTGAT GGGAAGCTCC TACGGATTCC AATACTCACC AGGACAGCGG GTTGAATTCC TCGTGCAAGC GTGGAAGTCC
ACCGGCACTA CCCTTCGAGG ATGCCTAAGG TTATGAGTGG TCCTGTCGCC CAACTTAAGG AGCACGTTCG CACCTTCAGG 



+2KKTP MGF SYD TRCF DST VTE SDIR TEE
6161 AAGAAAACCC CAATGGGGTT CTCGTATGAT ACCCGCTGCT TTGACTCCAC AGTCACTGAG AGCGACATCC GTACGGAGGA 
TTCTTTTGGG GTTACCCCAA GAGCATACTA TGGGCGACGA AACTGAGGTG TCAGTGACTC TCGCTGTAGG CATGCCTCCT 



+2 AIY QCCD LDP QAR VAIK SLT ERL YVG
6241 GGCAATCTAC CAATGTTGTG ACCTCGACCC CCAAGCCCGC GTGGCCATCA AGTCCCTCAC CGAGAGGCTT TATGTTGGGG 
CCGTTAGATG GTTACAACAC TGGAGCTGGG GGTTCGGGCG CACCGGTAGT TCAGGGAGTG GCTCTCCGAA ATACAACCCC 



+2GPLT NSR GENC GYR RCR ASGV LTT SCG
6321 GCCCTCTTAC CAATTCAAGG GGGGAGAACT GCGGCTATCG CAGGTGCCGC GCGAGCGGCG TACTGACAAC TAGCTGTGGT
CGGGAGAATG GTTAAGTTCC CCCCTCTTGA CGCCGATAGC GTCCACGGCG CGCTCGCCGC ATGACTGTTG ATCGACACCA 



+2NTLT CYI KAR AACR AAG LQD CTML VCG
6401 AACACCCTCA CTTGCTACAT CAAGGCCCGG GCAGCCTGTC GAGCCGCAGG GCTCCAGGAC TGCACCATGC TCGTGTGTGG 
TTGTGGGAGT GAACGATGTA GTTCCGGGCC CGTCGGACAG CTCGGCGTCC CGAGGTCCTG ACGTGGTACG AGCACACACC 



+2 DDL VVIC ESA GVQ EDAA SLR AFT EAM
6481 CGACGACTTA GTCGTTATCT GTGAAAGCGC GGGGGTCCAG GAGGACGCGG CGAGCCTGAG AGCCTTCACG GAGGCTATGA
GCTGCTGAAT CAGCAATAGA CACTTTCGCG CCCCCAGGTC CTCCTGCGCC GCTCGGACTC TCGGAAGTGC CTCCGATACT



+2TRYS APP GDPP QPE YDL ELIT SCS SNV
6561 CCAGGTACTC CGCCCCCCCT GGGGACCCCC CACAACCAGA ATACGACTTG GAGCTCATAA CATCATGCTC CTCCAACGTG
GGTCCATGAG GCGGGGGGGA CCCCTGGGGG GTGTTGGTCT TATGCTGAAC CTCGAGTATT GTAGTACGAG GAGGTTGCAC



+2SVAH DGA GKR VYYL TRD PTT PLAR AAW 
6641 TCAGTCGCCC ACGACGGCGC TGGAAAGAGG GTCTACTACC TCACCCGTGA CCCTACAACC CCCCTCGCGA GAGCTGCGTG 
AGTCAGCGGG TGCTGCCGCG ACCTTTCTCC CAGATGATGG AGTGGGCACT GGGATGTTGG GGGGAGCGCT CTCGACGCAC 



+2 ETA RHTP VNS WLG NIIM FAP TLW ARM 
6721 GGAGACAGCA AGACACACTC CAGTCAATTC CTGGCTAGGC AACATAATCA TGTTTGCCCC CACACTGTGG GCGAGGATGA 
CCTCTGTCGT TCTGTGTGAG GTCAGTTAAG GACCGATCCG TTGTATTAGT ACAAACGGGG GTGTGACACC CGCTCCTACT 



+2 ILMT HFF SVLI ARD QLE QALD CEI YGA 
6801 TACTGATGAC CCATTTCTTT AGCGTCCTTA TAGCCAGGGA CCAGCTTGAA CAGGCCCTCG ATTGCGAGAT CTACGGGGCC 
ATGACTACTG GGTAAAGAAA TCGCAGGAAT ATCGGTCCCT GGTCGAACTT GTCCGGGAGC TAACGCTCTA GATGCCCCGG 



+2 CYSI EPL DLP PIIQ RLH GLS AFSL HSY 
6881 TGCTACTCCA TAGAACCACT GGATCTACCT CCAATCATTC AAAGACTCCA TGGCCTCAGC GCATTTTCAC TCCACAGTTA 
ACGATGAGGT ATCTTGGTGA CCTAGATGGA GGTTAGTAAG TTTCTGAGGT ACCGGAGTCG CGTAAAAGTG AGGTGTCAAT 



+2 SPG EINR VAA CLR KLGV PPL RAW RHR 
6961 CTCTCCAGGT GAAATCAATA GGGTGGCCGC ATGCCTCAGA AAACTTGGGG TACCGCCCTT GCGAGCTTGG AGACACCGGG 
GAGAGGTCCA CTTTAGTTAT CCCACCGGCG TACGGAGTCT TTTGAACCCC ATGGCGGGAA CGCTCGAACC TCTGTGGCCC 



+2 ARSV RAR LLAR GGR AAI CGKY LFN WAV 
7041 CCCGGAGCGT CCGCGCTAGG CTTCTGGCCA GAGGAGGCAG GGCTGCCATA TGTGGCAAGT ACCTCTTCAA CTGGGCAGTA 
GGGCCTCGCA GGCGCGATCC GAAGACCGGT CTCCTCCGTC CCGACGGTAT ACACCGTTCA TGGAGAAGTT GACCCGTCAT 
+2 RTKL KLT PIA AAGQ LDL SGW FTAG YSG 
7121 AGAACAAAGC TCAAACTCAC TCCAATAGCG GCCGCTGGCC AGCTGGACTT GTCCGGCTGG TTCACGGCTG GCTACAGCGG 
TCTTGTTTCG AGTTTGAGTG AGGTTATCGC CGGCGACCGG TCGACCTGAA CAGGCCGACC AAGTGCCGAC CGATGTCGCC 



+2 GDI YHSV SHA RPR WIWF CLL LLA AGV 
7201 GGGAGACATT TATCACAGCG TGTCTCATGC CCGGCCCCGC TGGATCTGGT TTTGCCTACT CCTGCTTGCT GCAGGGGTAG 
CCCTCTGTAA ATAGTGTCGC ACAGAGTACG GGCCGGGGCG ACCTAGACCA AAACGGATGA GGACGAACGA CGTCCCCATC 



+2 GIYL LPN R 
7281 GCATCTACCT CCTCCCCAAC CGATGAAGGT TGGGGTAAAC ACTCCGGCCT AAAAAAAAAA AAAAATCTAG AAAGGCGCGC 
CGTAGATGGA GGAGGGGTTG GCTACTTCCA ACCCCATTTG TGAGGCCGGA TTTTTTTTTT TTTTTAGATC TTTCCGCGCG 



BamHI MluI 



7361 CAAGATATCA AGGATCCACT ACGCGTTAGA GCTCGCTGAT CAGCCTCGAC TGTGCCTTCT AGTTGCCAGC CATCTGTTGT 
GTTCTATAGT TCCTAGGTGA TGCGCAATCT CGAGCGACTA GTCGGAGCTG ACACGGAAGA TCAACGGTCG GTAGACAACA 



7441 TTGCCCCTCC CCCGTGCCTT CCTTGACCCT GGAAGGTGCC ACTCCCACTG TCCTTTCCTA ATAAAATGAG GAAATTGCAT 
AACGGGGAGG GGGCACGGAA GGAACTGGGA CCTTCCACGG TGAGGGTGAC AGGAAAGGAT TATTTTACTC CTTTAACGTA 



7521 CGCATTGTCT GAGTAGGTGT CATTCTATTC TGGGGGGTGG GGTGGGGCAG GACAGCAAGG GGGAGGATTG GGAAGACAAT 
GCGTAACAGA CTCATCCACA GTAAGATAAG ACCCCCCACC CCACCCCGTC CTGTCGTTCC CCCTCCTAAC CCTTCTGTTA 



7601 AGCAGGCATG CTGGGGAGCT CTTCCGCTTC CTCGCTCACT GACTCGCTGC GCTCGGTCGT TCGGCTGCGG CGAGCGGTAT 
TCGTCCGTAC GACCCCTCGA GAAGGCGAAG GAGCGAGTGA CTGAGCGACG CGAGCCAGCA AGCCGACGCC GCTCGCCATA 



7681 CAGCTCACTC AAAGGCGGTA ATACGGTTAT CCACAGAATC AGGGGATAAC GCAGGAAAGA ACATGTGAGC AAAAGGCCAG 
GTCGAGTGAG TTTCCGCCAT TATGCCAATA GGTGTCTTAG TCCCCTATTG CGTCCTTTCT TGTACACTCG TTTTCCGGTC 



7761 CAAAAGGCCA GGAACCGTAA AAAGGCCGCG TTGCTGGCGT TTTTCCATAG GCTCCGCCCC CCTGACGAGC ATCACAAAAA 
GTTTTCCGGT CCTTGGCATT TTTCCGGCGC AACGACCGCA AAAAGGTATC CGAGGCGGGG GGACTGCTCG TAGTGTTTTT 



7841 TCGACGCTCA AGTCAGAGGT GGCGAAACCC GACAGGACTA TAAAGATACC AGGCGTTTCC CCCTGGAAGC TCCCTCGTGC 
AGCTGCGAGT TCAGTCTCCA CCGCTTTGGG CTGTCCTGAT ATTTCTATGG TCCGCAAAGG GGGACCTTCG AGGGAGCACG 



7921 GCTCTCCTGT TCCGACCCTG CCGCTTACCG GATACCTGTC CGCCTTTCTC CCTTCGGGAA GCGTGGCGCT TTCTCAATGC 
CGAGAGGACA AGGCTGGGAC GGCGAATGGC CTATGGACAG GCGGAAAGAG GGAAGCCCTT CGCACCGCGA AAGAGTTACG 



8001 TCACGCTGTA GGTATCTCAG TTCGGTGTAG GTCGTTCGCT CCAAGCTGGG CTGTGTGCAC GAACCCCCCG TTCAGCCCGA 
AGTGCGACAT CCATAGAGTC AAGCCACATC CAGCAAGCGA GGTTCGACCC GACACACGTG CTTGGGGGGC AAGTCGGGCT 



8081 CCGCTGCGCC TTATCCGGTA ACTATCGTCT TGAGTCCAAC CCGGTAAGAC ACGACTTATC GCCACTGGCA GCAGCCACTG 
GGCGACGCGG AATAGGCCAT TGATAGCAGA ACTCAGGTTG GGCCATTCTG TGCTGAATAG CGGTGACCGT CGTCGGTGAC 



8161 GTAACAGGAT TAGCAGAGCG AGGTATGTAG GCGGTGCTAC AGAGTTCTTG AAGTGGTGGC CTAACTACGG CTACACTAGA 
CATTGTCCTA ATCGTCTCGC TCCATACATC CGCCACGATG TCTCAAGAAC TTCACCACCG GATTGATGCC GATGTGATCT 



8241 AGGACAGTAT TTGGTATCTG CGCTCTGCTG AAGCCAGTTA CCTTCGGAAA AAGAGTTGGT AGCTCTTGAT CCGGCAAACA 
TCCTGTCATA AACCATAGAC GCGAGACGAC TTCGGTCAAT GGAAGCCTTT TTCTCAACCA TCGAGAACTA GGCCGTTTGT 



8321 AACCACCGCT GGTAGCGGTG GTTTTTTTGT TTGCAAGCAG CAGATTACGC GCAGAAAAAA AGGATCTCAA GAAGATCCTT 
TTGGTGGCGA CCATCGCCAC CAAAAAAACA AACGTTCGTC GTCTAATGCG CGTCTTTTTT TCCTAGAGTT CTTCTAGGAA 



8401 TGATCTTTTC TACGGGGTCT GACGCTCAGT GGAACGAAAA CTCACGTTAA GGGATTTTGG TCATGAGATT ATCAAAAAGG 
ACTAGAAAAG ATGCCCCAGA CTGCGAGTCA CCTTGCTTTT GAGTGCAATT CCCTAAAACC AGTACTCTAA TAGTTTTTCC 
8481 


ATCTTCACCT AGATCCTTTT AAATTAAAAA TGAAGTTTTA AATCAATCTA AAGTATATAT GAGTAAACTT GGTCTGACAG 
TAGAAGTGGA TCTAGGAAAA TTTAATTTTT ACTTCAAAAT TTAGTTAGAT TTCATATATA CTCATTTGAA CCAGACTGTC 


8561 


TTACCAATGC TTAATCAGTG AGGCACCTAT CTCAGCGATC TGTCTATTTC GTTCATCCAT AGTTGCCTGA CTCCCCGTCG 
AATGGTTACG AATTAGTCAC TCCGTGGATA GAGTCGCTAG ACAGATAAAG CAAGTAGGTA TCAACGGACT GAGGGGCAGC 


8641 


TGTAGATAAC TACGATACGG GAGGGCTTAC CATCTGGCCC CAGTGCTGCA ATGATACCGC GAGACCCACG CTCACCGGCT 
ACATCTATTG ATGCTATGCC CTCCCGAATG GTAGACCGGG GTCACGACGT TACTATGGCG CTCTGGGTGC GAGTGGCCGA 


8721 


CCAGATTTAT CAGCAATAAA CCAGCCAGCC GGAAGGGCCG AGCGCAGAAG TGGTCCTGCA ACTTTATCCG CCTCCATCCA 
GGTCTAAATA GTCGTTATTT GGTCGGTCGG CCTTCCCGGC TCGCGTCTTC ACCAGGACGT TGAAATAGGC GGAGGTAGGT 


8801 


GTCTATTAAT TGTTGCCGGG AAGCTAGAGT AAGTAGTTCG CCAGTTAATA GTTTGCGCAA CGTTGTTGCC ATTGCTACAG 
CAGATAATTA ACAACGGCCC TTCGATCTCA TTCATCAAGC GGTCAATTAT CAAACGCGTT GCAACAACGG TAACGATGTC 


8881 


GCATCGTGGT GTCACGCTCG TCGTTTGGTA TGGCTTCATT CAGCTCCGGT TCCCAACGAT CAAGGCGAGT TACATGATCC 
CGTAGCACCA CAGTGCGAGC AGCAAACCAT ACCGAAGTAA GTCGAGGCCA AGGGTTGCTA GTTCCGCTCA ATGTACTAGG 


8961 


CCCATGTTGT GCAAAAAAGC GGTTAGCTCC TTCGGTCCTC CGATCGTTGT CAGAAGTAAG TTGGCCGCAG TGTTATCACT 
GGGTACAACA CGTTTTTTCG CCAATCGAGG AAGCCAGGAG GCTAGCAACA GTCTTCATTC AACCGGCGTC ACAATAGTGA 


9041 


CATGGTTATG GCAGCACTGC ATAATTCTCT TACTGTCATG CCATCCGTAA GATGCTTTTC TGTGACTGGT GAGTACTCAA 
GTACCAATAC CGTCGTGACG TATTAAGAGA ATGACAGTAC GGTAGGCATT CTACGAAAAG ACACTGACCA CTCATGAGTT 


9121 


CCAAGTCATT CTGAGAATAG TGTATGCGGC GACCGAGTTG CTCTTGCCCG GCGTCAATAC GGGATAATAC CGCGCCACAT 
GGTTCAGTAA GACTCTTATC ACATACGCCG CTGGCTCAAC GAGAACGGGC CGCAGTTATG CCCTATTATG GCGCGGTGTA 


9201 


AGCAGAACTT TAAAAGTGCT CATCATTGGA AAACGTTCTT CGGGGCGAAA ACTCTCAAGG ATCTTACCGC TGTTGAGATC 
TCGTCTTGAA ATTTTCACGA GTAGTAACCT TTTGCAAGAA GCCCCGCTTT TGAGAGTTCC TAGAATGGCG ACAACTCTAG 


9281 


CAGTTCGATG TAACCCACTC GTGCACCCAA CTGATCTTCA GCATCTTTTA CTTTCACCAG CGTTTCTGGG TGAGCAAAAA 
GTCAAGCTAC ATTGGGTGAG CACGTGGGTT GACTAGAAGT CGTAGAAAAT GAAAGTGGTC GCAAAGACCC ACTCGTTTTT 


9361 


CAGGAAGGCA AAATGCCGCA AAAAAGGGAA TAAGGGCGAC ACGGAAATGT TGAATACTCA TACTCTTCCT TTTTCAATAT 
GTCCTTCCGT TTTACGGCGT TTTTTCCCTT ATTCCCGCTG TGCCTTTACA ACTTATGAGT ATGAGAAGGA AAAAGTTATA 


9441 


TATTGAAGCA TTTATCAGGG TTATTGTCTC ATGAGCGGAT ACATATTTGA ATGTATTTAG AAAAATAAAC AAATAGGGGT 
ATAACTTCGT AAATAGTCCC AATAACAGAG TACTCGCCTA TGTATAAACT TACATAAATC TTTTTATTTG TTTATCCCCA 



9521 TCCGCGCACA TTTCCCCGAA AAGTGCCACC TGACGTCTAA GAAACCATTA TTATCATGAC ATTAACCTAT AAAAATAGGC 
AGGCGCGTGT AAAGGGGCTT TTCACCGTGG ACTGCAGATT CTTTGGTAAT AATAGTACTG TAATTGGATA TTTTTATCCG 



9601 GTATCACGAG GCCCTTTCGT C 
CATAGTGCTC CGGGAAAGCA G 
1 


TCGCGCGTTT CGGTGATGAC GGTGAAAACC TCTGACACAT GCAGCTCCCG GAGACGGTCA CAGCTTGTCT GTAAGCGGAT 
AGCGCGCAAA GCCACTACTG CCACTTTTGG AGACTGTGTA CGTCGAGGGC CTCTGCCAGT GTCGAACAGA CATTCGCCTA 


81 


GCCGGGAGCA GACAAGCCCG TCAGGGCGCG TCAGCGGGTG TTGGCGGGTG TCGGGGCTGG CTTAACTATG CGGCATCAGA 
CGGCCCTCGT CTGTTCGGGC AGTCCCGCGC AGTCGCCCAC AACCGCCCAC AGCCCCGACC GAATTGATAC GCCGTAGTCT 


161 


GCAGATTGTA CTGAGAGTGC ACCATATGAA GCTTTTTGCA AAAGCCTAGG CCTCCAAAAA AGCCTCCTCA CTACTTCTGG 
CGTCTAACAT GACTCTCACG TGGTATACTT CGAAAAACGT TTTCGGATCC GGAGGTTTTT TCGGAGGAGT GATGAAGACC 


241 


AATAGCTCAG AGGCCGAGGC GGCCTCGGCC TCTGCATAAA TAAAAAAAAT TAGTCAGCCA TGGGGCGGAG AATGGGCGGA 
TTATCGAGTC TCCGGCTCCG CCGGAGCCGG AGACGTATTT ATTTTTTTTA ATCAGTCGGT ACCCCGCCTC TTACCCGCCT 


321 


ACTGGGCGGG GAGGGAATTA TTGGCTATTG GCCATTGCAT ACGTTGTATC TATATCATAA TATGTACATT TATATTGGCT 
TGACCCGCCC CTCCCTTAAT AACCGATAAC CGGTAACGTA TGCAACATAG ATATAGTATT ATACATGTAA ATATAACCGA 


401 


CATGTCCAAT ATGACCGCCA TGTTGACATT GATTATTGAC TAGTTATTAA TAGTAATCAA TTACGGGGTC ATTAGTTCAT 
GTACAGGTTA TACTGGCGGT ACAACTGTAA CTAATAACTG ATCAATAATT ATCATTAGTT AATGCCCCAG TAATCAAGTA 


481 


AGCCCATATA TGGAGTTCCG CGTTACATAA CTTACGGTAA ATGGCCCGCC TGGCTGACCG CCCAACGACC CCCGCCCATT 
TCGGGTATAT ACCTCAAGGC GCAATGTATT GAATGCCATT TACCGGGCGG ACCGACTGGC GGGTTGCTGG GGGCGGGTAA 


561 


GACGTCAATA ATGACGTATG TTCCCATAGT AACGCCAATA GGGACTTTCC ATTGACGTCA ATGGGTGGAG TATTTACGGT 
CTGCAGTTAT TACTGCATAC AAGGGTATCA TTGCGGTTAT CCCTGAAAGG TAACTGCAGT TACCCACCTC ATAAATGCCA 


641 


AAACTGCCCA CTTGGCAGTA CATCAAGTGT ATCATATGCC AAGTCCGCCC CCTATTGACG TCAATGACGG TAAATGGCCC 
TTTGACGGGT GAACCGTCAT GTAGTTCACA TAGTATACGG TTCAGGCGGG GGATAACTGC AGTTACTGCC ATTTACCGGG 


721 


GCCTGGCATT ATGCCCAGTA CATGACCTTA CGGGACTTTC CTACTTGGCA GTACATCTAC GTATTAGTCA TCGCTATTAC 
CGGACCGTAA TACGGGTCAT GTACTGGAAT GCCCTGAAAG GATGAACCGT CATGTAGATG CATAATCAGT AGCGATAATG 


801 


CATGGTGATG CGGTTTTGGC AGTACACCAA TGGGCGTGGA TAGCGGTTTG ACTCACGGGG ATTTCCAAGT CTCCACCCCA 
GTACCACTAC GCCAAAACCG TCATGTGGTT ACCCGCACCT ATCGCCAAAC TGAGTGCCCC TAAAGGTTCA GAGGTGGGGT 


881 


TTGACGTCAA TGGGAGTTTG TTTTGGCACC AAAATCAACG GGACTTTCCA AAATGTCGTA ATAACCCCGC CCCGTTGACG 
AACTGCAGTT ACCCTCAAAC AAAACCGTGG TTTTAGTTGC CCTGAAAGGT TTTACAGCAT TATTGGGGCG GGGCAACTGC 


961 


CAAATGGGCG GTAGGCGTGT ACGGTGGGAG GTCTATATAA GCAGAGCTCG TTTAGTGAAC CGTCAGATCG CCTGGAGACG 
GTTTACCCGC CATCCGCACA TGCCACCCTC CAGATATATT CGTCTCGAGC AAATCACTTG GCAGTCTAGC GGACCTCTGC 


1041 


CCATCCACGC TGTTTTGACC TCCATAGAAG ACACCGGGAC CGATCCAGCC TCCGCGGCCG GGAACGGTGC ATTGGAACGC 
GGTAGGTGCG ACAAAACTGG AGGTATCTTC TGTGGCCCTG GCTAGGTCGG AGGCGCCGGC CCTTGCCACG TAACCTTGCG 


1121 


GGATTCCCCG TGCCAAGAGT GACGTAAGTA CCGCCTATAG ACTCTATAGG CACACCCCTT TGGCTCTTAT GCATGCTATA 
CCTAAGGGGC ACGGTTCTCA CTGCATTCAT GGCGGATATC TGAGATATCC GTGTGGGGAA ACCGAGAATA CGTACGATAT 


1201 


CTGTTTTTGG CTTGGGGCCT ATACACCCCC GCTCCTTATG CTATAGGTGA TGGTATAGCT TAGCCTATAG GTGTGGGTTA 
GACAAAAACC GAACCCCGGA TATGTGGGGG CGAGGAATAC GATATCCACT ACCATATCGA ATCGGATATC CACACCCAAT 


1281 


TTGACCATTA TTGACCACTC CCCTATTGGT GACGATACTT TCCATTACTA ATCCATAACA TGGCTCTTTG CCACAACTAT 
AACTGGTAAT AACTGGTGAG GGGATAACCA CTGCTATGAA AGGTAATGAT TAGGTATTGT ACCGAGAAAC GGTGTTGATA 


1361 


CTCTATTGGC TATATGCCAA TACTCTGTCC TTCAGAGACT GACACGGACT CTGTATTTTT ACAGGATGGG GTCCATTTAT 
GAGATAACCG ATATACGGTT ATGAGACAGG AAGTCTCTGA CTGTGCCTGA GACATAAAAA TGTCCTACCC CAGGTAAATA 


1441 


TATTTACAAA TTCACATATA CAACAACGCC GTCCCCCGTG CCCGCAGTTT TTATTAAACA TAGCGTGGGA TCTCCGACAT 
ATAAATGTTT AAGTGTATAT GTTGTTGCGG CAGGGGGCAC GGGCGTCAAA AATAATTTGT ATCGCACCCT AGAGGCTGTA 



21/100 



WO 01/38360 



PCT/US00/32326 



pCMV-II 

FIGURE 7 - Page 2 



1521 


CTCGGGTACG TGTTCCGGAC ATGGGCTCTT CTCCGGTAGC GGCGGAGCTT CCACATCCGA GCCCTGGTCC CATCCGTCCA 
GAGCCCATGC ACAAGGCCTG TACCCGAGAA GAGGCCATCG CCGCCTCGAA GGTGTAGGCT CGGGACCAGG GTAGGCAGGT 


1601 


GCGGCTCATG GTCGCTCGGC AGCTCCTTGC TCCTAACAGT GGAGGCCAGA CTTAGGCACA GCACAATGCC CACCACCACC 
CGCCGAGTAC CAGCGAGCCG TCGAGGAACG AGGATTGTCA CCTCCGGTCT GAATCCGTGT CGTGTTACGG GTGGTGGTGG 


1681 


AGTGTGCCGC ACAAGGCCGT GGCGGTAGGG TATGTGTCTG AAAATGAGCT CGGAGATTGG GCTCGCACCT GGACGCAGAT 
TCACACGGCG TGTTCCGGCA CCGCCATCCC ATACACAGAC TTTTACTCGA GCCTCTAACC CGAGCGTGGA CCTGCGTCTA 


1761 


GGAAGACTTA AGGCAGCGGC AGAAGAAGAT GCAGGCAGCT GAGTTGTTGT ATTCTGATAA GAGTCAGAGG TAACTCCCGT 
CCTTCTGAAT TCCGTCGCCG TCTTCTTCTA CGTCCGTCGA CTCAACAACA TAAGACTATT CTCAGTCTCC ATTGAGGGCA 




TGCGGTGCTG TTAACGGTGG AGGGCAGTGT AGTCTGAGCA GTACTCGTTG CTGCCGCGCG CGCCACCAGA CATAATAGCT 
ACGCCACGAC AATTGCCACC TCCCGTCACA TCAGACTCGT CATGAGCAAC GACGGCGCGC GCGGTGGTCT GTATTATCGA 




EcoRI 


1921 


GACAGACTAA CAGACTGTTC CTTTCCATGG GTCTTTTCTG CAGTCACCGT CGTCGACCTA AGAATTCAGA CTCGAGCAAG 
CTGTCTGATT GTCTGACAAG GAAAGGTACC CAGAAAAGAC GTCAGTGGCA GCAGCTGGAT TCTTAAGTCT GAGCTCGTTC 




XbaI BamHI MluI 


2001 


TCTAGAAAGG CGCGCCAAGA TATCAAGGAT CCACTACGCG TTAGAGCTCG CTGATCAGCC TCGACTGTGC CTTCTAGTTG 
AGATCTTTCC GCGCGGTTCT ATAGTTCCTA GGTGATGCGC AATCTCGAGC GACTAGTCGG AGCTGACACG GAAGATCAAC 


2081 


CCAGCCATCT GTTGTTTGCC CCTCCCCCGT GCCTTCCTTG ACCCTGGAAG GTGCCACTCC CACTGTCCTT TCCTAATAAA 
GGTCGGTAGA CAACAAACGG GGAGGGGGCA CGGAAGGAAC TGGGACCTTC CACGGTGAGG GTGACAGGAA AGGATTATTT 


2161 


ATGAGGAAAT TGCATCGCAT TGTCTGAGTA GGTGTCATTC TATTCTGGGG GGTGGGGTGG GGCAGGACAG CAAGGGGGAG 
TACTCCTTTA ACGTAGCGTA ACAGACTCAT CCACAGTAAG ATAAGACCCC CCACCCCACC CCGTCCTGTC GTTCCCCCTC 


2241 


GATTGGGAAG ACAATAGCAG GCATGCTGGG GAGCTCTTCC GCTTCCTCGC TCACTGACTC GCTGCGCTCG GTCGTTCGGC 
CTAACCCTTC TGTTATCGTC CGTACGACCC CTCGAGAAGG CGAAGGAGCG AGTGACTGAG CGACGCGAGC CAGCAAGCCG 


2321 


TGCGGCGAGC GGTATCAGCT CACTCAAAGG CGGTAATACG GTTATCCACA GAATCAGGGG ATAACGCAGG AAAGAACATG 
ACGCCGCTCG CCATAGTCGA GTGAGTTTCC GCCATTATGC CAATAGGTGT CTTAGTCCCC TATTGCGTCC TTTCTTGTAC 


2401 


TGAGCAAAAG GCCAGCAAAA GGCCAGGAAC CGTAAAAAGG CCGCGTTGCT GGCGTTTTTC CATAGGCTCC GCCCCCCTGA 
ACTCGTTTTC CGGTCGTTTT CCGGTCCTTG GCATTTTTCC GGCGCAACGA CCGCAAAAAG GTATCCGAGG CGGGGGGACT 


2481 


CGAGCATCAC AAAAATCGAC GCTCAAGTCA GAGGTGGCGA AACCCGACAG GACTATAAAG ATACCAGGCG TTTCCCCCTG 
GCTCGTAGTG TTTTTAGCTG CGAGTTCAGT CTCCACCGCT TTGGGCTGTC CTGATATTTC TATGGTCCGC AAAGGGGGAC 


2561 


GAAGCTCCCT CGTGCGCTCT CCTGTTCCGA CCCTGCCGCT TACCGGATAC CTGTCCGCCT TTCTCCCTTC GGGAAGCGTG 
CTTCGAGGGA GCACGCGAGA GGACAAGGCT GGGACGGCGA ATGGCCTATG GACAGGCGGA AAGAGGGAAG CCCTTCGCAC 


2641 


GCGCTTTCTC AATGCTCACG CTGTAGGTAT CTCAGTTCGG TGTAGGTCGT TCGCTCCAAG CTGGGCTGTG TGCACGAACC 

CGCGAAAGAG TTACGAGTGC GACATCCATA GAGTCAAGCC ACATCCAGCA AGCGAGGTTC GACCCGACAC ACGTGCTTGG 


2721 


CCCCGTTCAG CCCGACCGCT GCGCCTTATC CGGTAACTAT CGTCTTGAGT CCAACCCGGT AAGACACGAC TTATCGCCAC 
GGGGCAAGTC GGGCTGGCGA CGCGGAATAG GCCATTGATA GCAGAACTCA GGTTGGGCCA TTCTGTGCTG AATAGCGGTG 


2801 


TGGCAGCAGC CACTGGTAAC AGGATTAGCA GAGCGAGGTA TGTAGGCGGT GCTACAGAGT TCTTGAAGTG GTGGCCTAAC 
ACCGTCGTCG GTGACCATTG TCCTAATCGT CTCGCTCCAT ACATCCGCCA CGATGTCTCA AGAACTTCAC CACCGGATTG 


2881 


TACGGCTACA CTAGAAGGAC AGTATTTGGT ATCTGCGCTC TGCTGAAGCC AGTTACCTTC GGAAAAAGAG TTGGTAGCTC 
ATGCCGATGT GATCTTCCTG TCATAAACCA TAGACGCGAG ACGACTTCGG TCAATGGAAG CCTTTTTCTC AACCATCGAG 
TTGATCCGGC AAACAAACCA CCGCTGGTAG CGGTGGTTTT TTTGTTTGCA AGCAGCAGAT TACGCGCAGA AAAAAAGGAT 
AACTAGGCCG TTTGTTTGGT GGCGACCATC GCCACCAAAA AAACAAACGT TCGTCGTCTA ATGCGCGTCT TTTTTTCCTA 




CTCAAGAAGA TCCTTTGATC TTTTCTACGG GGTCTGACGC TCAGTGGAAC GAAAACTCAC GTTAAGGGAT TTTGGTCATG 
GAGTTCTTCT AGGAAACTAG AAAAGATGCC CCAGACTGCG AGTCACCTTG CTTTTGAGTG CAATTCCCTA AAACCAGTAC 


3121 


AGATTATCAA AAAGGATCTT CACCTAGATC CTTTTAAATT AAAAATGAAG TTTTAAATCA ATCTAAAGTA TATATGAGTA 
TCTAATAGTT TTTCCTAGAA GTGGATCTAG GAAAATTTAA TTTTTACTTC AAAATTTAGT TAGATTTCAT ATATACTCAT 


3201 


AACTTGGTCT GACAGTTACC AATGCTTAAT CAGTGAGGCA CCTATCTCAG CGATCTGTCT ATTTCGTTCA TCCATAGTTG 
TTGAACCAGA CTGTCAATGG TTACGAATTA GTCACTCCGT GGATAGAGTC GCTAGACAGA TAAAGCAAGT AGGTATCAAC 


3281 


CCTGACTCCC CGTCGTGTAG ATAACTACGA TACGGGAGGG CTTACCATCT GGCCCCAGTG CTGCAATGAT ACCGCGAGAC 
GGACTGAGGG GCAGCACATC TATTGATGCT ATGCCCTCCC GAATGGTAGA CCGGGGTCAC GACGTTACTA TGGCGCTCTG 


3361 


CCACGCTCAC CGGCTCCAGA TTTATCAGCA ATAAACCAGC CAGCCGGAAG GGCCGAGCGC AGAAGTGGTC CTGCAACTTT 
GGTGCGAGTG GCCGAGGTCT AAATAGTCGT TATTTGGTCG GTCGGCCTTC CCGGCTCGCG TCTTCACCAG GACGTTGAAA 


3441 


ATCCGCCTCC ATCCAGTCTA TTAATTGTTG CCGGGAAGCT AGAGTAAGTA GTTCGCCAGT TAATAGTTTG CGCAACGTTG 
TAGGCGGAGG TAGGTCAGAT AATTAACAAC GGCCCTTCGA TCTCATTCAT CAAGCGGTCA ATTATCAAAC GCGTTGCAAC 


3521 


TTGCCATTGC TACAGGCATC GTGGTGTCAC GCTCGTCGTT TGGTATGGCT TCATTCAGCT CCGGTTCCCA ACGATCAAGG 
AACGGTAACG ATGTCCGTAG CACCACAGTG CGAGCAGCAA ACCATACCGA AGTAAGTCGA GGCCAAGGGT TGCTAGTTCC 


3601 


CGAGTTACAT GATCCCCCAT GTTGTGCAAA AAAGCGGTTA GCTCCTTCGG TCCTCCGATC GTTGTCAGAA GTAAGTTGGC 
GCTCAATGTA CTAGGGGGTA CAACACGTTT TTTCGCCAAT CGAGGAAGCC AGGAGGCTAG CAACAGTCTT CATTCAACCG 


3681 


CGCAGTGTTA TCACTCATGG TTATGGCAGC ACTGCATAAT TCTCTTACTG TCATGCCATC CGTAAGATGC TTTTCTGTGA 
GCGTCACAAT AGTGAGTACC AATACCGTCG TGACGTATTA AGAGAATGAC AGTACGGTAG GCATTCTACG AAAAGACACT 


3761 


CTGGTGAGTA CTCAACCAAG TCATTCTGAG AATAGTGTAT GCGGCGACCG AGTTGCTCTT GCCCGGCGTC AATACGGGAT 
GACCACTCAT GAGTTGGTTC AGTAAGACTC TTATCACATA CGCCGCTGGC TCAACGAGAA CGGGCCGCAG TTATGCCCTA 


3841 


AATACCGCGC CACATAGCAG AACTTTAAAA GTGCTCATCA TTGGAAAACG TTCTTCGGGG CGAAAACTCT CAAGGATCTT 
TTATGGCGCG GTGTATCGTC TTGAAATTTT CACGAGTAGT AACCTTTTGC AAGAAGCCCC GCTTTTGAGA GTTCCTAGAA 


3921 


ACCGCTGTTG AGATCCAGTT CGATGTAACC CACTCGTGCA CCCAACTGAT CTTCAGCATC TTTTACTTTC ACCAGCGTTT 
TGGCGACAAC TCTAGGTCAA GCTACATTGG GTGAGCACGT GGGTTGACTA GAAGTCGTAG AAAATGAAAG TGGTCGCAAA 


4001 


CTGGGTGAGC AAAAACAGGA AGGCAAAATG CCGCAAAAAA GGGAATAAGG GCGACACGGA AATGTTGAAT ACTCATACTC 
GACCCACTCG TTTTTGTCCT TCCGTTTTAC GGCGTTTTTT CCCTTATTCC CGCTGTGCCT TTACAACTTA TGAGTATGAG 


4081 


TTCCTTTTTC AATATTATTG AAGCATTTAT CAGGGTTATT GTCTCATGAG CGGATACATA TTTGAATGTA TTTAGAAAAA 
AAGGAAAAAG TTATAATAAC TTCGTAAATA GTCCCAATAA CAGAGTACTC GCCTATGTAT AAACTTACAT AAATCTTTTT 


4161 


TAAACAAATA GGGGTTCCGC GCACATTTCC CCGAAAAGTG CCACCTGACG TCTAAGAAAC CATTATTATC ATGACATTAA 
ATTTGTTTAT CCCCAAGGCG CGTGTAAAGG GGCTTTTCAC GGTGGACTGC AGATTCTTTG GTAATAATAG TACTGTAATT 


4241 


CCTATAAAAA TAGGCGTATC ACGAGGCCCT TTCGTC 
GGATATTTTT ATCCGCATAG TGCTCCGGGA AAGCAG 
TCGCGCGTTT 
AGCGCGCAAA 


CGGTGATGAC GGTGAAAACC TCTGACACAT 
GCCACTACTG CCACTTTTGG AGACTGTGTA 


GCAGCTCCCG 
CGTCGAGGGC 


51 


GAGACGGTCA 
CTCTGCCAGT 


CAGCTTGTCT GTAAGCGGAT GCCGGGAGCA 
GTCGAACAGA CATTCGCCTA CGGCCCTCGT 


GACAAGCCCG 
CTGTTCGGGC 


101 


TCAGGGCGCG 
AGTCCCGCGC 


TCAGCGGGTG TTGGCGGGTG TCGGGGCTGG 
AGTCGCCCAC AACCGCCCAC AGCCCCGACC 


CTTAACTATG 
GAATTGATAC 


151 


CGGCATCAGA 
GCCGTAGTCT 


GCAGATTGTA CTGAGAGTGC ACCATATGAA 
CGTCTAACAT GACTCTCACG TGGTATACTT 


GCTTTTTGCA 
CGAAAAACGT 




StuI 




201 


AAAGCCTAGG 
TTTCGGATCC 


CCTCCAAAAA AGCCTCCTCA CTACTTCTGG 
GGAGGTTTTT TCGGAGGAGT GATGAAGACC 


AATAGCTCAG 
TTATCGAGTC 


251 


AGGCCGAGGC 
TCCGGCTCCG 


GGCCTCGGCC TCTGCATAAA TAAAAAAAAT 
CCGGAGCCGG AGACGTATTT ATTTTTTTTA 


TAGTCAGCCA 
ATCAGTCGGT 


301 


TGGGGCGGAG 
ACCCCGCCTC 


AATGGGCGGA ACTGGGCGGG GAGGGAATTA 
TTACCCGCCT TGACCCGCCC CTCCCTTAAT 


TTGGCTATTG 
AACCGATAAC 




GCCATTGCAT 
CGGTAACGTA 


ACGTTGTATC TATATCATAA TATGTACATT 
TGCAACATAG ATATAGTATT ATACATGTAA 


TATATTGGCT 
ATATAACCGA 




CATGTCCAAT 
GTACAGGTTA 


ATGACCGCCA TGTTGACATT GATTATTGAC TAGTTATTAA 
TACTGGCGGT ACAACTGTAA CTAATAACTG ATCAATAATT 




TAGTAATCAA 
ATCATTAGTT 


TTACGGGGTC ATTAGTTCAT AGCCCATATA 
AATGCCCCAG TAATCAAGTA TCGGGTATAT 


TGGAGTTCCG 
ACCTCAAGGC 




CGTTACATAA CTTACGGTAA ATGGCCCGCC TGGCTGACCG CCCAACGACC 
GCAATGTATT GAATGCCATT TACCGGGCGG ACCGACTGGC GGGTTGCTGG 


551 


CCCGCCCATT 
GGGCGGGTAA 


GACGTCAATA ATGACGTATG TTCCCATAGT AACGCCAATA 
CTGCAGTTAT TACTGCATAC AAGGGTATCA TTGCGGTTAT 


601 


GGGACTTTCC ATTGACGTCA ATGGGTGGAG TATTTACGGT 
CCCTGAAAGG TAACTGCAGT TACCCACCTC ATAAATGCCA 


AAACTGCCCA 
TTTGACGGGT 


651 


CTTGGCAGTA CATCAAGTGT ATCATATGCC AAGTCCGCCC 
GAACCGTCAT GTAGTTCACA TAGTATACGG TTCAGGCGGG 


CCTATTGACG 
GGATAACTGC 


701 


TCAATGACGG TAAATGGCCC GCCTGGCATT ATGCCCAGTA CATGACCTTA 
AGTTACTGCC ATTTACCGGG CGGACCGTAA TACGGGTCAT GTACTGGAAT 


751 


CGGGACTTTC CTACTTGGCA GTACATCTAC GTATTAGTCA TCGCTATTAC 
GCCCTGAAAG GATGAACCGT CATGTAGATG CATAATCAGT AGCGATAATG 


801 


CATGGTGATG CGGTTTTGGC AGTACACCAA TGGGCGTGGA 
GTACCACTAC GCCAAAACCG TCATGTGGTT ACCCGCACCT 


TAGCGGTTTG 
ATCGCCAAAC 


851 


ACTCACGGGG 
TGAGTGCCCC 


ATTTCCAAGT CTCCACCCCA TTGACGTCAA 
TAAAGGTTCA GAGGTGGGGT AACTGCAGTT 


TGGGAGTTTG 
ACCCTCAAAC 
901 


TTTTGGCACC AAAATCAACG 
AAAACCGTGG TTTTAGTTGC 


GGACTTTCCA 
CCTGAAAGGT 


AAATGTCGTA 
TTTACAGCAT 


ATAACCCCGC 
TATTGGGGCG 


951 


CCCGTTGACG CAAATGGGCG 
GGGCAACTGC GTTTACCCGC 


GTAGGCGTGT 
CATCCGCACA 


ACGGTGGGAG 
TGCCACCCTC 


GTCTATATAA 
CAGATATATT 


1001 


GCAGAGCTCG TTTAGTGAAC 
CGTCTCGAGC AAATCACTTG 


CGTCAGATCG 
GCAGTCTAGC 


CCTGGAGACG 
GGACCTCTGC 


CCATCCACGC 
GGTAGGTGCG 


1051 


TGTTTTGACC TCCATAGAAG 
ACAAAACTGG AGGTATCTTC 


ACACCGGGAC 
TGTGGCCCTG 


CGATCCAGCC 
GCTAGGTCGG 


TCCGCGGCCG 
AGGCGCCGGC 


1101 


GGAACGGTGC ATTGGAACGC 
CCTTGCCACG TAACCTTGCG 


GGATTCCCCG 
CCTAAGGGGC 


TGCCAAGAGT 
ACGGTTCTCA 


GACGTAAGTA 
CTGCATTCAT 


1151 


CCGCCTATAG ACTCTATAGG 
GGCGGATATC TGAGATATCC 


CACACCCCTT 
GTGTGGGGAA 


TGGCTCTTAT 
ACCGAGAATA 


GCATGCTATA 
CGTACGATAT 


1201 


CTGTTTTTGG CTTGGGGCCT ATACACCCCC GCTCCTTATG CTATAGGTGA 
GACAAAAACC GAACCCCGGA TATGTGGGGG CGAGGAATAC GATATCCACT 


1251 


TGGTATAGCT TAGCCTATAG 
ACCATATCGA ATCGGATATC 


GTGTGGGTTA 
CACACCCAAT 


TTGACCATTA 
AACTGGTAAT 


TTGACCACTC 
AACTGGTGAG 


1301 


CCCTATTGGT GACGATACTT 
GGGATAACCA CTGCTATGAA 


TCCATTACTA 
AGGTAATGAT 


ATCCATAACA 
TAGGTATTGT 


TGGCTCTTTG 
ACCGAGAAAC 


1351 


CCACAACTAT CTCTATTGGC 
GGTGTTGATA GAGATAACCG 


TATATGCCAA 
ATATACGGTT 


TACTCTGTCC 
ATGAGACAGG 


TTCAGAGACT 
AAGTCTCTGA 


1401 


GACACGGACT CTGTATTTTT 
CTGTGCCTGA GACATAAAAA 


ACAGGATGGG 
TGTCCTACCC 


GTCCATTTAT 
CAGGTAAATA 


TATTTACAAA 
ATAAATGTTT 


1451 


TTCACATATA CAACAACGCC GTCCCCCGTG CCCGCAGTTT TTATTAAACA 
AAGTGTATAT GTTGTTGCGG CAGGGGGCAC GGGCGTCAAA AATAATTTGT 


1501 


TAGCGTGGGA TCTCCGACAT 
ATCGCACCCT AGAGGCTGTA 


CTCGGGTACG 
GAGCCCATGC 


TGTTCCGGAC 
ACAAGGCCTG 


ATGGGCTCTT 
TACCCGAGAA 


1551 


CTCCGGTAGC GGCGGAGCTT 
GAGGCCATCG CCGCCTCGAA 


CCACATCCGA 
GGTGTAGGCT 


GCCCTGGTCC 
CGGGACCAGG 


CATCCGTCCA 
GTAGGCAGGT 


1601 


GCGGCTCATG GTCGCTCGGC AGCTCCTTGC TCCTAACAGT GGAGGCCAGA 
CGCCGAGTAC CAGCGAGCCG TCGAGGAACG AGGATTGTCA CCTCCGGTCT 


1651 


CTTAGGCACA GCACAATGCC 
GAATCCGTGT CGTGTTACGG 


CACCACCACC AGTGTGCCGC ACAAGGCCGT 
GTGGTGGTGG TCACACGGCG TGTTCCGGCA 


1701 


GGCGGTAGGG TATGTGTCTG AAAATGAGCT CGGAGATTGG GCTCGCACCT 
CCGCCATCCC ATACACAGAC TTTTACTCGA GCCTCTAACC CGAGCGTGGA 


1751 


GGACGCAGAT GGAAGACTTA 
CCTGCGTCTA CCTTCTGAAT 


AGGCAGCGGC 
TCCGTCGCCG 


AGAAGAAGAT 
TCTTCTTCTA 


GCAGGCAGCT 
CGTCCGTCGA 


1801 


GAGTTGTTGT ATTCTGATAA GAGTCAGAGG TAACTCCCGT TGCGGTGCTG 
CTCAACAACA TAAGACTATT CTCAGTCTCC ATTGAGGGCA ACGCCACGAC 
1851 TTAACGGTGG AGGGCAGTGT AGTCTGAGCA GTACTCGTTG CTGCCGCGCG 
AATTGCCACC TCCCGTCACA TCAGACTCGT CATGAGCAAC GACGGCGCGC 



1901 CGCCACCAGA CATAATAGCT GACAGACTAA CAGACTGTTC CTTTCCATGG 
GCGGTGGTCT GTATTATCGA CTGTCTGATT GTCTGACAAG GAAAGGTACC 



+2 MAP 

EcoRI 

1951 GTCTTTTCTG CAGTCACCGT CGTCGACCTA AGAATTCACC ATGGCGCCCA 
CAGAAAAGAC GTCAGTGGCA GCAGCTGGAT TCTTAAGTGG TACCGCGGGT 



+2 ITAY AQQ TRGL LGC IIT 
2001 TCACGGCGTA CGCCCAGCAG ACAAGGGGCC TCCTAGGGTG CATAATCACC 
AGTGCCGCAT GCGGGTCGTC TGTTCCCCGG AGGATCCCAC GTATTAGTGG 



+2 SLTG RDK NQV EGEV QIV 
2051 AGCCTAACTG GCCGGGACAA AAACCAAGTG GAGGGTGAGG TCCAGATTGT 
TCGGATTGAC CGGCCCTGTT TTTGGTTCAC CTCCCACTCC AGGTCTAACA 



+2 STA AQTF LAT CIN GVC 
2101 GTCAACTGCT GCCCAAACCT TCCTGGCAAC GTGCATCAAT GGGGTGTGCT 
CAGTTGAGGA CGGGTTTGGA AGGACCGTTG CACGTAGTTA CCCCACACGA 



+2 WTVY HGA GTRT IAS PKG 

2151 GGACTGTCTA CCACGGGGCC GGAACGAGGA CCATCGCGTC ACCCAAGGGT 
CCTGACAGAT GGTGCCCCGG CCTTGCTCCT GGTAGCGCAG TGGGTTCCCA 



+2 PVIQ MYT NVD QDLV GWP 
2201 CCTGTCATCC AGATGTATAC CAATGTAGAC CAAGACCTTG TGGGCTGGCC 
GGACAGTAGG TCTACATATG GTTACATCTG GTTCTGGAAC ACCCGACCGG 



+2 ASQ GTRS LTP CTC GSS 
2251 CGCTTCGCAA GGTACCCGCT CATTGACACC CTGCACTTGC GGCTCCTCGG 
GCGAAGCGTT CCATGGGCGA GTAACTGTGG GACGTGAACG CCGAGGAGCC 



+2 DLYL VTR HADV IPV RRR 
2301 ACCTTTACCT GGTCACGAGG CACGCCGATG TCATTCCCGT GCGCCGGCGG 
TGGAAATGGA CCAGTGCTCC GTGCGGCTAC AGTAAGGGCA CGCGGCCGCC 



+2 GDSR GSL LSP RPIS YLK 
2351 GGTGATAGCA GGGGCAGCCT GCTGTCGCCC CGGCCCATTT CCTACTTGAA 
CCACTATCGT CCCCGTCGGA CGACAGCGGG GCCGGGTAAA GGATGAACTT 



+ 2 GSS GGPL LCP AGH AVG 
2401 AGGCTCCTCG GGGGGTCCGC TGTTGTGCCC CGCGGGGCAC GCCGTGGGCA 
TCCGAGGAGC CCCCCAGGCG ACAACACGGG GCGCCCCGTG CGGCACCCGT 



+2IFRA AVC TRGV AKA VDF 
2451 TATTTAGGGC CGCGGTGTGC ACCCGTGGAG TGGCTAAGGC GGTGGACTTT 
ATAAATCCCG GCGCCACACG TGGGCACCTC ACCGATTCCG CCACCTGAAA 



+2 IPVE NLE TTM RSPV FTD 
2501 ATCCCTGTGG AGAACCTAGA GACAACCATG AGGTCCCCGG TGTTCACGGA 
TAGGGACACC TCTTGGATCT CTGTTGGTAC TCCAGGGGCC ACAAGTGCCT 
+2 NSS PPVV PQS FQV AHL 
2551 TAACTCCTCT CCACCAGTAG TGCCCCAGAG CTTCCAGGTG GCTCACCTCC 
ATTGAGGAGA GGTGGTCATC ACGGGGTCTC GAAGGTCCAC CGAGTGGAGG 



+ 2HAPT GSG KSTK VPA AYA 
2601 ATGCTCCCAC AGGCAGCGGC AAAAGCACCA AGGTCCCGGC TGCATATGCA 
TACGAGGGTG TCCGTCGCCG TTTTCGTGGT TCCAGGGCCG ACGTATACGT 



+2 AQGY KVL VLN PSVA ATL 
2651 GCTCAGGGCT ATAAGGTGCT AGTACTCAAC CCCTCTGTTG CTGCAACACT 
CGAGTCCCGA TATTCCACGA TCATGAGTTG GGGAGACAAC GACGTTGTGA 



+ 2 GFG AYMS KAH GID PNI 
2701 GGGCTTTGGT GCTTACATGT CCAAGGCTCA TGGGATCGAT CCTAACATCA 
CCCGAAACCA CGAATGTACA GGTTCCGAGT ACCCTAGCTA GGATTGTAGT 



+ 2RTGV RTI TTGS PIT YST 
2751 GGACCGGGGT GAGAACAATT ACCACTGGCA GCCCCATCAC GTACTCCACC 
CCTGGCCCCA CTCTTGTTAA TGGTGACCGT CGGGGTAGTG CATGAGGTGG 



+2YGKF LAD GGC SGGA YDI 
2801 TACGGCAAGT TCCTTGCCGA CGGCGGGTGC TCGGGGGGCG CTTATGACAT 
ATGCCGTTCA AGGAACGGCT GCCGCCCACG AGCCCCCCGC GAATACTGTA 



+2 IIC DECH STD ATS ILG 
2851 AATAATTTGT GACGAGTGCC ACTCCACGGA TGCCACATCC ATCTTGGGCA 
TTATTAAACA CTGCTCACGG TGAGGTGCCT ACGGTGTAGG TAGAACCCGT 



+2 IGTV LDQ AETA GAR LVV 
2901 TTGGCACTGT CCTTGACCAA GCAGAGACTG CGGGGGCGAG ACTGGTTGTG 
AACCGTGACA GGAACTGGTT CGTCTCTGAC GCCCCCGCTC TGACCAACAC 



+ 2 LATA TPP GSV TVPH PNI 
2951 CTCGCCACCG CCACCCCTCC GGGCTCCGTC ACTGTGCCCC ATCCCAACAT 
GAGCGGTGGC GGTGGGGAGG CCCGAGGCAG TGACACGGGG TAGGGTTGTA 



+ 2 EEV ALST TGE IPF YGK 
3001 CGAGGAGGTT GCTCTGTCCA CCACCGGAGA GATCCCTTTT TACGGCAAGG 
GCTCCTCCAA CGAGACAGGT GGTGGCCTCT CTAGGGAAAA ATGCCGTTCC 



+2 AIPL EVI KGGR HLI FCH 
3051 CTATCCCCCT CGAAGTAATC AAGGGGGGGA GACATCTCAT CTTCTGTCAT 
GATAGGGGGA GCTTCATTAG TTCCCCCCCT CTGTAGAGTA GAAGACAGTA 



+2 SKKK CDE LAA KLVA LGI 
3101 TCAAAGAAGA AGTGCGACGA ACTCGCCGCA AAGCTGGTCG CATTGGGCAT 
AGTTTCTTCT TCACGCTGCT TGAGCGGCGT TTCGACCAGC GTAACCCGTA 



+2 NAV AYYR GLD VSV IPT 
3151 CAATGCCGTG GCCTACTACC GCGGTCTTGA CGTGTCCGTC ATCCCGACCA 
GTTACGGCAC CGGATGATGG CGCCAGAACT GCACAGGCAG TAGGGCTGGT 



+2 SGDV VVV ATDA LMT GYT 
3201 GCGGCGATGT TGTCGTCGTG GCAACCGATG CCCTCATGAC CGGCTATACC 
CGCCGCTACA ACAGCAGCAC CGTTGGCTAC GGGAGTACTG GCCGATATGG 
+2 
3251 


GDFD SVI DCN TCVT QTV 
GGCGACTTCG ACTCGGTGAT AGACTGCAAT ACGTGTGTCA CCCAGACAGT 
CCGCTGAAGC TGAGCCACTA TCTGACGTTA TGCACACAGT GGGTCTGTCA 


+2 
3301 


DFSLDPTFTIETITLP 
CGATTTCAGC CTTGACCCTA CCTTCACCAT TGAGACAATC ACGCTCCCCC 
GCTAAAGTCG GAACTGGGAT GGAAGTGGTA ACTCTGTTAG TGCGAGGGGG 


+2 

3351 


QDAV SRT QRRG RTG RGK 
AAGATGCTGT CTCCCGCACT CAACGTCGGG GCAGGACTGG CAGGGGGAAG 
TTCTACGACA GAGGGCGTGA GTTGCAGCCC CGTCCTGACC GTCCCCCTTC 


+ 2 
3401 


PGIY RFV APG ERPS GMF 
CCAGGCATCT ACAGATTTGT GGCACCGGGG GAGCGCCCCT CCGGCATGTT 
GGTCCGTAGA TGTCTAAACA CCGTGGCCCC CTCGCGGGGA GGCCGTACAA 


+ 2 
3451 


DSS VLCE CYD AGC AWY 
CGACTCGTCC GTCCTCTGTG AGTGCTATGA CGCAGGCTGT GCTTGGTATG 
GCTGAGCAGG CAGGAGACAC TCACGATACT GCGTCCGACA CGAACCATAC 


+ 2 
3501 


ELTP AET TVRL RAY MNT 
AGCTCACGCC CGCCGAGACT ACAGTTAGGC TACGAGCGTA CATGAACACC 
TCGAGTGCGG GCGGCTCTGA TGTCAATCCG ATGCTCGCAT GTACTTGTGG 


+2 
3551 


PGLP VCQ DHL EFWE GVF 
CCGGGGCTTC CCGTGTGCCA GGACCATCTT GAATTTTGGG AGGGCGTCTT 
GGCCCCGAAG GGCACACGGT CCTGGTAGAA CTTAAAACCC TCCCGCAGAA 


+2 


TGL THID AHF LSQ TKQ 
StuI 


3601 


TACAGGCCTC ACTCATATAG ATGCCCACTT TCTATCCCAG ACAAAGCAGA 
ATGTCCGGAG TGAGTATATC TACGGGTGAA AGATAGGGTC TGTTTCGTCT 


+2 
3651 


SGENLPYLVAYQATVCA 
GTGGGGAGAA CCTTCCTTAC CTGGTAGCGT ACCAAGCCAC CGTGTGCGCT 
CACCCCTCTT GGAAGGAATG GACCATCGCA TGGTTCGGTG GCACACGCGA 


+2 
3701 


RAQA PPP SWD QMWK CLI 
AGGGCTCAAG CCCCTCCCCC ATCGTGGGAC CAGATGTGGA AGTGTTTGAT 
TCCCGAGTTC GGGGAGGGGG TAGCACCCTG GTCTACACCT TCACAAACTA 


+2 
3751 


RLK PTLH GPT PLL YRL 
TCGCCTCAAG CCCACCCTCC ATGGGCCAAC ACCCCTGCTA TACAGACTGG 
AGCGGAGTTC GGGTGGGAGG TACCCGGTTG TGGGGACGAT ATGTCTGACC 


+2 
3801 


GAVQNEITLTHPVTKYI 
GCGCTGTTCA GAATGAAATC ACCCTGACGC ACCCAGTCAC CAAATACATC 
CGCGACAAGT CTTACTTTAG TGGGACTGCG TGGGTCAGTG GTTTATGTAG 


+2 
3851 


MTCM SAD LEV VTST WVL 
ATGACATGCA TGTCGGCCGA CCTGGAGGTC GTCACGAGCA CCTGGGTGCT 
TACTGTACGT ACAGCCGGCT GGACCTCCAG CAGTGCTCGT GGACCCACCA 


+2 
3901 


VGG VLAA LAA YCL STG 
CGTTGGCGGC GTCCTGGCTG CTTTGGCCGC GTATTGCCTG TCAACAGGCT 
GCAACCGCCG CAGGACCGAC GAAACCGGCG CATAACGGAC AGTTGTCCGA 
+2 CVV IVGR VVLS GKP AII 
3951 GCGTGGTCAT AGTGGGCAGG GTCGTCTTGT CCGGGAAGCC GGCAATCATA 
CGCACCAGTA TCACCCGTCC CAGCAGAACA GGCCCTTCGG CCGTTAGTAT 



+2PDRE VLY REF DEME EC 
4001 CCTGACAGGG AAGTCCTCTA CCGAGAGTTC GATGAGATGG AAGAGTGCTA 
GGACTGTCCC TTCAGGAGAT GGCTCTCAAG CTACTCTACC TTCTCACGAT 



BamHI MluI 



4051 GGATCCACTA CGCGTTAGAG CTCGCTGATC AGCCTCGACT GTGCCTTCTA 
CCTAGGTGAT GCGCAATCTC GAGCGACTAG TCGGAGCTGA CACGGAAGAT 



4101 GTTGCCAGCC ATCTGTTGTT TGCCCCTCCC CCGTGCCTTC CTTGACCCTG 
CAACGGTCGG TAGACAACAA ACGGGGAGGG GGCACGGAAG GAACTGGGAC 



4151 GAAGGTGCCA CTCCCACTGT CCTTTCCTAA TAAAATGAGG AAATTGCATC 
CTTCCACGGT GAGGGTGACA GGAAAGGATT ATTTTACTCC TTTAACGTAG 



4201 GCATTGTCTG AGTAGGTGTC ATTCTATTCT GGGGGGTGGG GTGGGGCAGG 
CGTAACAGAC TCATCCACAG TAAGATAAGA CCCCCCACCC CACCCCGTCC 



4251 ACAGCAAGGG GGAGGATTGG GAAGACAATA GCAGGCATGC TGGGGAGCTC 
TGTCGTTCCC CCTCCTAACC CTTCTGTTAT CGTCCGTACG ACCCCTCGAG 



4301 TTCCGCTTCC TCGCTCACTG ACTCGCTGCG CTCGGTCGTT CGGCTGCGGC 
AAGGCGAAGG AGCGAGTGAC TGAGCGACGC GAGCCAGCAA GCCGACGCCG 



4351 GAGCGGTATC AGCTCACTCA AAGGCGGTAA TACGGTTATC CACAGAATCA 
CTCGCCATAG TCGAGTGAGT TTCCGCCATT ATGCCAATAG GTGTCTTAGT 



4401 GGGGATAACG CAGGAAAGAA CATGTGAGCA AAAGGCCAGC AAAAGGCCAG 
CCCCTATTGC GTCCTTTCTT GTACACTCGT TTTCCGGTCG TTTTCCGGTC 



4451 GAACCGTAAA AAGGCCGCGT TGCTGGCGTT TTTCCATAGG CTCCGCCCCC 
CTTGGCATTT TTCCGGCGCA ACGACCGCAA AAAGGTATCC GAGGCGGGGG 



4501 CTGACGAGCA TCACAAAAAT CGACGCTCAA GTCAGAGGTG GCGAAACCCG 
GACTGCTCGT AGTGTTTTTA GCTGCGAGTT CAGTCTCCAC CGCTTTGGGC 



4551 ACAGGACTAT AAAGATACCA GGCGTTTCCC CCTGGAAGCT CCCTCGTGCG 
TGTCCTGATA TTTCTATGGT CCGCAAAGGG GGACCTTCGA GGGAGCACGC 



4601 CTCTCCTGTT CCGACCCTGC CGCTTACCGG ATACCTGTCC GCCTTTCTCC 
GAGAGGACAA GGCTGGGACG GCGAATGGCC TATGGACAGG CGGAAAGAGG 



4651 CTTCGGGAAG CGTGGCGCTT TCTCAATGCT CACGCTGTAG GTATCTCAGT 
GAAGCCCTTC GCACCGCGAA AGAGTTACGA GTGCGACATC CATAGAGTCA 



4701 TCGGTGTAGG TCGTTCGCTC CAAGCTGGGC TGTGTGCACG AACCCCCCGT 
AGCCACATCC AGCAAGCGAG GTTCGACCCG ACACACGTGC TTGGGGGGCA 



4751 TCAGCCCGAC CGCTGCGCCT TATCCGGTAA CTATCGTCTT GAGTCCAACC 
AGTCGGGCTG GCGACGCGGA ATAGGCCATT GATAGCAGAA CTCAGGTTGG 



4801 CGGTAAGACA CGACTTATCG CCACTGGCAG CAGCCACTGG TAACAGGATT 
GCCATTCTGT GCTGAATAGC GGTGACCGTC GTCGGTGACC ATTGTCCTAA 
4851 


AGCAGAGCGA GGTATGTAGG CGGTGCTACA GAGTTCTTGA AGTGGTGGCC 
TCGTCTCGCT CCATACATCC GCCACGATGT CTCAAGAACT TCACCACCGG 


4901 


TAACTACGGC 
ATTGATGCCG 


TACACTAGAA 
ATGTGATCTT 


GGACAGTATT TGGTATCTGC GCTCTGCTGA 
CCTGTCATAA ACCATAGACG CGAGACGACT 


4951 


AGCCAGTTAC 
TCGGTCAATG 


CTTCGGAAAA 
GAAGCCTTTT 


AGAGTTGGTA GCTCTTGATC CGGCAAACAA 
TCTCAACCAT CGAGAACTAG GCCGTTTGTT 


5001 


ACCACCGCTG 
TGGTGGCGAC 


GTAGCGGTGG 
CATCGCCACC 


TTTTTTTGTT TGCAAGCAGC AGATTACGCG 
AAAAAAACAA ACGTTCGTCG TCTAATGCGC 


5051 CAGAAAAAAA GGATCTCAAG AAGATCCTTT GATCTTTTCT ACGGGGTCTG 
GTCTTTTTTT CCTAGAGTTC TTCTAGGAAA CTAGAAAAGA TGCCCCAGAC 


5101 ACGCTCAGTG GAACGAAAAC TCACGTTAAG GGATTTTGGT CATGAGATTA 
TGCGAGTCAC CTTGCTTTTG AGTGCAATTC CCTAAAACCA GTACTCTAAT 


5151 TCAAAAAGGA TCTTCACCTA GATCCTTTTA AATTAAAAAT GAAGTTTTAA 
AGTTTTTCCT AGAAGTGGAT CTAGGAAAAT TTAATTTTTA CTTCAAAATT 


5201 ATCAATCTAA AGTATATATG AGTAAACTTG GTCTGACAGT TACCAATGCT 
TAGTTAGATT TCATATATAC TCATTTGAAC CAGACTGTCA ATGGTTACGA 


5251 TAATCAGTGA GGCACCTATC TCAGCGATCT GTCTATTTCG TTCATCCATA 
ATTAGTCACT CCGTGGATAG AGTCGCTAGA CAGATAAAGC AAGTAGGTAT 


5301 GTTGCCTGAC TCCCCGTCGT GTAGATAACT ACGATACGGG AGGGCTTACC 
CAACGGACTG AGGGGCAGCA CATCTATTGA TGCTATGCCC TCCCGAATGG 


5351 ATCTGGCCCC AGTGCTGCAA TGATACCGCG AGACCCACGC TCACCGGCTC 
TAGACCGGGG TCACGACGTT ACTATGGCGC TCTGGGTGCG AGTGGCCGAG 


5401 CAGATTTATC AGCAATAAAC CAGCCAGCCG GAAGGGCCGA GCGCAGAAGT 
GTCTAAATAG TCGTTATTTG GTCGGTCGGC CTTCCCGGCT CGCGTCTTCA 


5451 GGTCCTGCAA CTTTATCCGC CTCCATCCAG TCTATTAATT GTTGCCGGGA 
CCAGGACGTT GAAATAGGCG GAGGTAGGTC AGATAATTAA CAACGGCCCT 


5501 AGCTAGAGTA AGTAGTTCGC CAGTTAATAG TTTGCGCAAC GTTGTTGCCA 
TCGATCTCAT TCATCAAGCG GTCAATTATC AAACGCGTTG CAACAACGGT 


5551 TTGCTACAGG CATCGTGGTG TCACGCTCGT CGTTTGGTAT GGCTTCATTC 
AACGATGTCC GTAGCACCAC AGTGCGAGCA GCAAACCATA CCGAAGTAAG 


5601 AGCTCCGGTT CCCAACGATC AAGGCGAGTT ACATGATCCC CCATGTTGTG 
TCGAGGCCAA GGGTTGCTAG TTCCGCTCAA TGTACTAGGG GGTACAACAC 


5651 CAAAAAAGCG GTTAGCTCCT TCGGTCCTCC GATCGTTGTC AGAAGTAAGT 
GTTTTTTCGC CAATCGAGGA AGCCAGGAGG CTAGCAACAG TCTTCATTCA 


5701 TGGCCGCAGT GTTATCACTC ATGGTTATGG CAGCACTGCA TAATTCTCTT 
ACCGGCGTCA CAATAGTGAG TACCAATACC GTCGTGACGT ATTAAGAGAA 


5751 ACTGTCATGC CATCCGTAAG ATGCTTTTCT GTGACTGGTG AGTACTCAAC 
TGACAGTACG GTAGGCATTC TACGAAAAGA CACTGACCAC TCATGAGTTG 



5801 CAAGTCATTC TGAGAATAGT GTATGCGGCG ACCGAGTTGC TCTTGCCCGG 
GTTCAGTAAG ACTCTTATCA CATACGCCGC TGGCTCAACG AGAACGGGCC 


5851 CGTCAATACG GGATAATACC GCGCCACATA GCAGAACTTT AAAAGTGCTC 
GCAGTTATGC CCTATTATGG CGCGGTGTAT CGTCTTGAAA TTTTCACGAG 


5901 ATCATTGGAA AACGTTCTTC GGGGCGAAAA CTCTCAAGGA TCTTACCGCT 
TAGTAACCTT TTGCAAGAAG CCCCGCTTTT GAGAGTTCCT AGAATGGCGA 


5951 GTTGAGATCC AGTTCGATGT AACCCACTCG TGCACCCAAC TGATCTTCAG 
CAACTCTAGG TCAAGCTACA TTGGGTGAGC ACGTGGGTTG ACTAGAAGTC 


6001 CATCTTTTAC TTTCACCAGC GTTTCTGGGT GAGCAAAAAC AGGAAGGCAA 
GTAGAAAATG AAAGTGGTCG CAAAGACCCA CTCGTTTTTG TCCTTCCGTT 


6051 AATGCCGCAA AAAAGGGAAT AAGGGCGACA CGGAAATGTT GAATACTCAT 
TTACGGCGTT TTTTCCCTTA TTCCCGCTGT GCCTTTACAA CTTATGAGTA 


6101 ACTCTTCCTT TTTCAATATT ATTGAAGCAT TTATCAGGGT TATTGTCTCA 
TGAGAAGGAA AAAGTTATAA TAACTTCGTA AATAGTCCCA ATAACAGAGT 


6151 TGAGCGGATA CATATTTGAA TGTATTTAGA AAAATAAACA AATAGGGGTT 
ACTCGCCTAT GTATAAACTT ACATAAATCT TTTTATTTGT TTATCCCCAA 


6201 CCGCGCACAT TTCCCCGAAA AGTGCCACCT GACGTCTAAG AAACCATTAT 
GGCGCGTGTA AAGGGGCTTT TCACGGTGGA CTGCAGATTC TTTGGTAATA 


6251 TATCATGACA TTAACCTATA AAAATAGGCG TATCACGAGG CCCTTTCGTC 
ATAGTACTGT AATTGGATAT TTTTATCCGC ATAGTGCTCC GGGAAAGCAG 



MetAlaAlaTyrAlaAlaGlnGlyTyrLysValLeuVal 
2 AGCTTACAAAACAAATTCACCATGGCTGCATATGCAGCTCAGGGCTATAAGGTGCTAGTA 
TCGAATGTTTTGTTTAAGTGGTACCGACGTATACGTCGAGTCCCGATATTCCACGATCAT 

1 HIND3, 21 NCOI, 30 NDEI, 58 SCAI, 

LeuAsnProSerValAlaAlaThrLeuGlyPheGlyAlaTyrMetSerLysAlaHisGly 
62 CTCAACCCCTCTGTTGCTGCAACACTGGGCTTTGGTGCTTACATGTCCAAGGCTCATGGG 
GAGTTGGGGAGACAACGACGTTGTGACCCGAAACCACGAATGTACAGGTTCCGAGTACCC 

IleAspProAsnIleArgThrGlyValArgThrIleThrThrGlySerProIleThrTyr 
122 ATCGATCCTAACATCAGGACCGGGGTGAGAACAATTACCACTGGCAGCCCCATCACGTAC 
TAGCTAGGATTGTAGTCCTGGCCCCACTCTTGTTAATGGTGACCGTCGGGGTAGTGCATG 

A 

122 CLAI, 

SerThrTyrGlyLysPheLeuAlaAspGlyGlyCysSerGlyGlyAlaTyrAspIleIle 
182 TCCACCTACGGCAAGTTCCTTGCCGACGGCGGGTGCTCGGGGGGCGCTTATGACATAATA 
AGGTGGATGCCGTTCAAGGAACGGCTGCCGCCCACGAGCCCCCCGCGAATACTGTATTAT 

IleCysAspGluCysHisSerThrAspAlaThrSerIleLeuGlyIleGlyThrValLeu 
242 ATTTGTGACGAGTGCCACTCCACGGATGCCACATCCATCTTGGGCATTGGCACTGTCCTT 
TAAACACTGCTCACGGTGAGGTGCCTACGGTGTAGGTAGAACCCGTAACCGTGACAGGAA 

AspGlnAlaGluThrAlaGlyAlaArgLeuValValLeuAlaThrAlaThrProProGly 
302 GACCAAGCAGAGACTGCGGGGGCGAGACTGGTTGTGCTCGCCACCGCCACCCCTCCGGGC 
CTGGTTCGTCTCTGACGCCCCCGCTCTGACCAACACGAGCGGTGGCGGTGGGGAGGCCCG 

309 ALWNI, 

SerValThrValProHisProAsnIleGluGluValAlaLeuSerThrThrGlyGluIle 
362 TCCGTCACTGTGCCCCATCCCAACATCGAGGAGGTTGCTCTGTCCACCACCGGAGAGATC 
AGGCAGTGACACGGGGTAGGGTTGTAGCTCCTCCAACGAGACAGGTGGTGGCCTCTCTAG 

ProPheTyrGlyLysAlaIleProLeuGluValIleLysGlyGlyArgHisLeuIlePhe 
422 CCTTTTTACGGCAAGGCTATCCCCCTCGAAGTAATCAAGGGGGGGAGACATCTCATCTTC 
GGAAAAATGCCGTTCCGATAGGGGGAGCTTCATTAGTTCCCCCCCTCTGTAGAGTAGAAG 

CysHisSerLysLysLysCysAspGluLeuAlaAlaLysLeuValAlaLeuGlyIleAsn 
482 TGTCATTCAAAGAAGAAGTGCGACGAACTCGCCGCAAAGCTGGTCGCATTGGGCATCAAT 
ACAGTAAGTTTCTTCTTCACGCTGCTTGAGCGGCGTTTCGACCAGCGTAACCCGTAGTTA 

AlaValAlaTyrTyrArgGlyLeuAspValSerValIleProThrSerGlyAspValVal 
542 GCCGTGGCCTACTACCGCGGTCTTGACGTGTCCGTCATCCCGACCAGCGGCGATGTTGTC 
CGGCACCGGATGATGGCGCCAGAACTGCACAGGCAGTAGGGCTGGTCGCCGCTACAACAG 

A A 

556 SAC2, 566 DRDI, 

ValValAlaThrAspAlaLeuMetThrGlyTyrThrGlyAspPheAspSerValIleAsp 
602 GTCGTGGCAACCGATGCCCTCATGACCGGCTATACCGGCGACTTCGACTCGGTGATAGAC 
CAGCACCGTTGGCTACGGGAGTACTGGCCGATATGGCCGCTGAAGCTGAGCCACTATCTG 

A 

621 BSPHI, 

CysAsnThrCysValThrGlnThrValAspPheSerLeuAspProThrPheThrIleGlu 
662 TGCAATACGTGTGTCACCCAGACAGTCGATTTCAGCCTTGACCCTACCTTCACCATTGAG 
ACGTTATGCACACAGTGGGTCTGTCAGCTAAAGTCGGAACTGGGATGGAAGTGGTAACTC 

ThrIleThrLeuProGlnAspAlaValSerArgThrGlnArgArgGlyArgThrGlyArg 
722 ACAATCACGCTCCCCCAAGATGCTGTCTCCCGCACTCAACGTCGGGGCAGGACTGGCAGG 
TGTTAGTGCGAGGGGGTTCTACGACAGAGGGCGTGAGTTGCAGCCCCGTCCTGACCGTCC 

GlyLysProGlyIleTyrArgPheValAlaProGlyGluArgProSerGlyMetPheAsp 
782 GGGAAGCCAGGCATCTACAGATTTGTGGCACCGGGGGAGCGCCCCTCCGGCATGTTCGAC 
CCCTTCGGTCCGTAGATGTCTAAACACCGTGGCCCCCTCGCGGGGAGGCCGTACAAGCTG 

822 BGLI, 839 DRDI, 

SerSerValLeuCysGluCysTyrAspAlaGlyCysAlaTrpTyrGluLeuThrProAla 
842 TCGTCCGTCCTCTGTGAGTGCTATGACGCAGGCTGTGCTTGGTATGAGCTCACGCCCGCC 
AGCAGGCAGGAGACACTCACGATACTGCGTCCGACACGAACCATACTCGAGTGCGGGCGG 

887 SACI, 

GluThrThrValArgLeuArgAlaTyrMetAsnThrProGlyLeuProValCysGlnAsp 
902 GAGACTACAGTTAGGCTACGAGCGTACATGAACACCCCGGGGCTTCCCGTGTGCCAGGAC 
CTCTGATGTCAATCCGATGCTCGCATGTACTTGTGGGGCCCCGAAGGGCACACGGTCCTG 

A 

937 SMAI XMAI, 

HisLeuGluPheTrpGluGlyValPheThrGlyLeuThrHisIleAspAlaHisPheLeu 
962 CATCTTGAATTTTGGGAGGGCGTCTTTACAGGCCTCACTCATATAGATGCCCACTTTCTA 
GTAGAACTTAAAACCCTCCCGCAGAAATGTCCGGAGTGAGTATATCTACGGGTGAAAGAT 

A 

991 STUI, 

SerGlnThrLysGlnSerGlyGluAsnLeuProTyrLeuValAlaTyrGlnAlaThrVal 
1022 TCCCAGACAAAGCAGAGTGGGGAGAACCTTCCTTACCTGGTAGCGTACCAAGCCACCGTG 
AGGGTCTGTTTCGTCTCACCCCTCTTGGAAGGAATGGACCATCGCATGGTTCGGTGGCAC 

A 

1075 DRA3, 

CysAlaArgAlaGlnAlaProProProSerTrpAspGlnMetTrpLysCysLeuIleArg 
1082 TGCGCTAGGGCTCAAGCCCCTCCCCCATCGTGGGACCAGATGTGGAAGTGTTTGATTCGC 
ACGCGATCCCGAGTTCGGGGAGGGGGTAGCACCCTGGTCTACACCTTCACAAACTAAGCG 

LeuLysProThrLeuHisGlyProThrProLeuLeuTyrArgLeuGlyAlaValGlnAsn 
1142 CTCAAGCCCACCCTCCATGGGCCAACACCCCTGCTATACAGACTGGGCGCTGTTCAGAAT 
GAGTTCGGGTGGGAGGTACCCGGTTGTGGGGACGATATGTCTGACCCGCGACAAGTCTTA 

A 

1156 NCOI, 

GluIleThrLeuThrHisProValThrLysTyrIleMetThrCysMetSerAlaAspLeu 
1202 GAAATCACCCTGACGCACCCAGTCACCAAATACATCATGACATGCATGTCGGCCGACCTG 
CTTTAGTGGGACTGCGTGGGTCAGTGGTTTATGTAGTACTGTACGTACAGCCGGCTGGAC 

^ A A ^ ^ 

1236 BSPHI, 1240 DRDI, 1243 AVA3, 1251 EAGI XMA3, 1256 DRDI, 



GluValValThrSerThrTrpValLeuValGlyGlyValLeuAlaAlaLeuAlaAlaTyr 
1262 GAGGTCGTCACGAGCACCTGGGTGCTCGTTGGCGGCGTCCTGGCTGCTTTGGCCGCGTAT 
CTCCAGCAGTGCTCGTGGACCCACGAGCAACCGCCGCAGGACCGACGAAACCGGCGCATA 

CysLeuSerThrGlyCysValValIleValGlyArgValValLeuSerGlyLysProAla 
1322 TGCCTGTCAACAGGCTGCGTGGTCATAGTGGGCAGGGTCGTCTTGTCCGGGAAGCCGGCA 
ACGGACAGTTGTCCGACGCACCAGTATCACCCGTCCCAGCAGAACAGGCCCTTCGGCCGT 

1375 NAEI, 

IleIleProAspArgGluValLeuTyrArgGluPheAspGluMetGluGluCysSerGln 
1382 ATCATACCTGACAGGGAAGTCCTCTACCGAGAGTTCGATGAGATGGAAGAGTGCTCTCAG 
TAGTATGGACTGTCCCTTCAGGAGATGGCTCTCAAGCTACTCTACCTTCTCACGAGAGTC 

1391 DRDI, 

HisLeuProTyrIleGluGlnGlyMetMetLeuAlaGluGlnPheLysGlnLysAlaLeu 
1442 CACTTACCGTACATCGAGCAAGGGATGATGCTCGCCGAGCAGTTCAAGCAGAAGGCCCTC 
GTGAATGGCATGTAGCTCGTTCCCTACTACGAGCGGCTCGTCAAGTTCGTCTTCCGGGAG 

GlyLeuLeuGlnThrAlaSerArgGlnAlaGluValIleAlaProAlaValGlnThrAsn 
1502 GGCCTCCTGCAGACCGCGTCCCGTCAGGCAGAGGTTATCGCCCCTGCTGTCCAGACCAAC 
CCGGAGGACGTCTGGCGCAGGGCAGTCCGTCTCCAATAGCGGGGACGACAGGTCTGGTTG 

A A 

1508 PSTI, 1513 TTH3I, 

TrpGlnLysLeuGluThrPheTrpAlaLysHisMetTrpAsnPheIleSerGlyIleGln 
1562 TGGCAAAAACTCGAGACCTTCTGGGCGAAGCATATGTGGAACTTCATCAGTGGGATACAA 
ACCGTTTTTGAGCTCTGGAAGACCCGCTTCGTATACACCTTGAAGTAGTCACCCTATGTT 

A A 

1571 XHOI, 1592 NDEI, 

TyrLeuAlaGlyLeuSerThrLeuProGlyAsnProAlaIleAlaSerLeuMetAlaPhe 
1622 TACTTGGCGGGCTTGTCAACGCTGCCTGGTAACCCCGCCATTGCTTCATTGATGGCTTTT 
ATGAACCGCCCGAACAGTTGCGACGGACCATTGGGGCGGTAACGAAGTAACTACCGAAAA 

A 

1649 BSTE2, 

ThrAlaAlaValThrSerProLeuThrThrSerGlnThrLeuLeuPheAsnIleLeuGly 
1682 ACAGCTGCTGTCACCAGCCCACTAACCACTAGCCAAACCCTCCTCTTCAACATATTGGGG 
TGTCGACGACAGTGGTCGGGTGATTGGTGATCGGTTTGGGAGGAGAAGTTGTATAACCCC 

A 

1683 ALWNI PVU2, 

GlyTrpValAlaAlaGlnLeuAlaAlaProGlyAlaAlaThrAlaPheValGlyAlaGly 
1742 GGGTGGGTGGCTGCCCAGCTCGCCGCCCCCGGTGCCGCTACTGCCTTTGTGGGCGCTGGC 
CCCACCCACCGACGGGTCGAGCGGCGGGGGCCACGGCGATGACGGAAACACCCGCGACCG 

A 

1800 ESPI, 

LeuAlaGlyAlaAlaIleGlySerValGlyLeuGlyLysValLeuIleAspIleLeuAla 
1802 TTAGCTGGCGCCGCCATCGGCAGTGTTGGACTGGGGAAGGTCCTCATAGACATCCTTGCA 
AATCGACCGCGGCGGTAGCCGTCACAACCTGACCCCTTCCAGGAGTATCTGTAGGAACGT 

A 

1808 KASI NARI, 

GlyTyrGlyAlaGlyValAlaGlyAlaLeuValAlaPheLysIleMetSerGlyGluVal 
1862 GGGTATGGCGCGGGCGTGGCGGGAGCTCTTGTGGCATTCAAGATCATGAGCGGTGAGGTC 
CCCATACCGCGCCCGCACCGCCCTCGAGAACACCGTAAGTTCTAGTACTCGCCACTCCAG 

1884 SACI, 1905 BSPHI, 

ProSerThrGluAspLeuValAsnLeuLeuProAlaIleLeuSerProGlyAlaLeuVal 
1922 CCCTCCACGGAGGACCTGGTCAATCTACTGCCCGCCATCCTCTCGCCCGGAGCCCTCGTA 
GGGAGGTGCCTCCTGGACCAGTTAGATGACGGGCGGTAGGAGAGCGGGCCTCGGGAGCAT 

1934 TTH3I, 

ValGlyValValCysAlaAlaIleLeuArgArgHisValGlyProGlyGluGlyAlaVal 
1982 GTCGGCGTGGTCTGTGCAGCAATACTGCGCCGGCACGTTGGCCCGGGCGAGGGGGCAGTG 
CAGCCGCACCAGACACGTCGTTATGACGCGGCCGTGCAACCGGGCCCGCTCCCCCGTCAC 

2010 NAEI, 2023 SMAI XMAI, 

GlnTrpMetAsnArgLeuIleAlaPheAlaSerArgGlyAsnHisValSerProThrHis 
2042 CAGTGGATGAACCGGCTGATAGCCTTCGCCTCCCGGGGGAACCATGTTTCCCCCACGCAC 
GTCACCTACTTGGCCGACTATCGGAAGCGGAGGGCCCCCTTGGTACAAAGGGGGTGCGTG 

2073 SMAI XMAI, 2099 DRA3, 

TyrValProGluSerAspAlaAlaAlaArgValThrAlaIleLeuSerSerLeuThrVal 
2102 TACGTGCCGGAGAGCGATGCAGCTGCCCGCGTCACTGCCATACTCAGCAGCCTCACTGTA 
ATGCACGGCCTCTCGCTACGTCGACGGGCGCAGTGACGGTATGAGTCGTCGGAGTGACAT 

A 

2121 PVU2, 

ThrGlnLeuLeuArgArgLeuHisGlnTrpIleSerSerGluCysThrThrProCysSer 
2162 ACCCAGCTCCTGAGGCGACTGCACCAGTGGATAAGCTCGGAGTGTACCACTCCATGCTCC 
TGGGTCGAGGACTCCGCTGACGTGGTCACCTATTCGAGCCTCACATGGTGAGGTACGAGG 

A A 

2165 ALWNI, 2170 MST2, 

GlySerTrpLeuArgAspIleTrpAspTrpIleCysGluValLeuSerAspPheLysThr 
2222 GGTTCCTGGCTAAGGGACATCTGGGACTGGATATGCGAGGTGTTGAGCGACTTTAAGACC 
CCAAGGACCGATTCCCTGTAGACCCTGACCTATACGCTCCACAACTCGCTGAAATTCTGG 

A 

2226 ECONI, 

TrpLeuLysAlaLysLeuMetProGlnLeuProGlyIleProPheValSerCysGlnArg 
2282 TGGCTAAAAGCTAAGCTCATGCCACAGCTGCCTGGGATCCCCTTTGTGTCCTGCCAGCGC 
ACCGATTTTCGATTCGAGTACGGTGTCGACGGACCCTAGGGGAAACACAGGACGGTCGCG 

^ A A 

2291 ESPI, 2306 PVU2, 2316 BAMHI, 

GlyTyrLysGlyValTrpArgGlyAspGlyIleMetHisThrArgCysHisCysGlyAla 
2342 GGGTATAAGGGGGTCTGGCGAGGGGACGGCATCATGCACACTCGCTGCCACTGTGGAGCT 
CCCATATTCCCCCAGACCGCTCCCCTGCCGTAGTACGTGTGAGCGACGGTGACACCTCGA 

GluIleThrGlyHisValLysAsnGlyThrMetArgIleValGlyProArgThrCysArg 
2402 GAGATCACTGGACATGTCAAAAACGGGACGATGAGGATCGTCGGTCCTAGGACCTGCAGG 
CTCTAGTGACCTGTACAGTTTTTGCCCTGCTACTCCTAGCAGCCAGGATCCTGGACGTCC 

^ AAA 

2431 BSABI, 2447 AVR2, 2454 SSE8387I, 2455 PSTI, 

AsnMetTrpSerGlyThrPheProIleAsnAlaTyrThrThrGlyProCysThrProLeu 
2462 AACATGTGGAGTGGGACCTTCCCCATTAATGCCTACACCACGGGCCCCTGTACCCCCCTT 
TTGTACACCTCACCCTGGAAGGGGTAATTACGGATGTGGTGCCCGGGGACATGGGGGGAA 

2486 ASEI, 2503 APAI, 

ProAlaProAsnTyrThrPheAlaLeuTrpArgValSerAlaGluGluTyrValGluIle 
2522 CCTGCGCCGAACTACACGTTCGCGCTATGGAGGGTGTCTGCAGAGGAATACGTGGAGATA 
GGACGCGGCTTGATGTGCAAGCGCGATACCTCCCACAGACGTCTCCTTATGCACCTCTAT 

2559 PSTI, 

ArgGlnValGlyAspPheHisTyrValThrGlyMetThrThrAspAsnLeuLysCysPro 
2582 AGGCAGGTGGGGGACTTCCACTACGTGACGGGTATGACTACTGACAATCTTAAATGCCCG 
TCCGTCCACCCCCTGAAGGTGATGCACTGCCCATACTGATGACTGTTAGAATTTACGGGC 

2600 DRA3, 

CysGlnValProSerProGluPhePheThrGluLeuAspGlyValArgLeuHisArgPhe 
2642 TGCCAGGTCCCATCGCCCGAATTTTTCACAGAATTGGACGGGGTGCGCCTACATAGGTTT 
ACGGTCCAGGGTAGCGGGCTTAAAAAGTGTCTTAACCTGCCCCACGCGGATGTATCCAAA 

AlaProProCysLysProLeuLeuArgGluGluValSerPheArgValGlyLeuHisGlu 
2702 GCGCCCCCCTGCAAGCCCTTGCTGCGGGAGGAGGTATCATTCAGAGTAGGACTCCACGAA 
CGCGGGGGGACGTTCGGGAACGACGCCCTCCTCCATAGTAAGTCTCATCCTGAGGTGCTT 

TyrProValGlySerGlnLeuProCysGluProGluProAspValAlaValLeuThrSer 
2762 TACCCGGTAGGGTCGCAATTACCTTGCGAGCCCGAACCGGACGTGGCCGTGTTGACGTCC 
ATGGGCCATCCCAGCGTTAATGGAACGCTCGGGCTTGGCCTGCACCGGCACAACTGCAGG 

2763 HGIE2, 2815 AAT2, 

MetLeuThrAspProSerHisIleThrAlaGluAlaAlaGlyArgArgLeuAlaArgGly 
2822 ATGCTCACTGATCCCTCCCATATAACAGCAGAGGCGGCCGGGCGAAGGTTGGCGAGGGGA 
TACGAGTGACTAGGGAGGGTATATTGTCGTCTCCGCCGGCCCGCTTCCAACCGCTCCCCT 

2856 EAGI XMA3, 

SerProProSerValAlaSerSerSerAlaSerGlnLeuSerAlaProSerLeuLysAla 
2882 TCACCCCCCTCTGTGGCCAGCTCCTCGGCTAGCCAGCTATCCGCTCCATCTCTCAAGGCA 
AGTGGGGGGAGACACCGGTCGAGGAGCCGATCGGTCGATAGGCGAGGTAGAGAGTTCCGT 

A A 

2895 BALI, 2909 NHEI, 

ThrCysThrAlaAsnHisAspSerProAspAlaGluLeuIleGluAlaAsnLeuLeuTrp 
2942 ACTTGCACCGCTAACCATGACTCCCCTGATGCTGAGCTCATAGAGGCCAACCTCCTATGG 
TGAACGTGGCGATTGGTACTGAGGGGACTACGACTCGAGTATCTCCGGTTGGAGGATACC 

A A 

2972 ESPI, 2975 SACI, 

ArgGlnGluMetGlyGlyAsnIleThrArgValGluSerGluAsnLysValValIleLeu 
3002 AGGCAGGAGATGGGCGGCAACATCACCAGGGTTGAGTCAGAAAACAAAGTGGTGATTCTG 
TCCGTCCTCTACCCGCCGTTGTAGTGGTCCCAACTCAGTCTTTTGTTTCACCACTAAGAC 

AspSerPheAspProLeuValAlaGluGluAspGluArgGluIleSerValProAlaGlu 
3062 GACTCCTTCGATCCGCTTGTGGCGGAGGAGGACGAGCGGGAGATCTCCGTACCCGCAGAA 
CTGAGGAAGCTAGGCGAACACCGCCTCCTCCTGCTCGCCCTCTAGAGGCATGGGCGTCTT 

3102 BGL2, 

IleLeuArgLysSerArgArgPheAlaGlnAlaLeuProValTrpAlaArgProAspTyr 
3122 ATCCTGCGGAAGTCTCGGAGATTCGCCCAGGCCCTGCCCGTTTGGGCGCGGCCGGACTAT 
TAGGACGCCTTCAGAGCCTCTAAGCGGGTCCGGGACGGGCAAACCCGCGCCGGCCTGATA 

3149 ALWNI, 3170 EAGI XMA3, 

AsnProProLeuValGluThrTrpLysLysProAspTyrGluProProValValHisGly 
3182 AACCCCCCGCTAGTGGAGACGTGGAAAAAGCCCGACTACGAACCACCTGTGGTCCATGGC 
TTGGGGGGCGATCACCTCTGCACCTTTTTCGGGCTGATGCTTGGTGGACACCAGGTACCG 

3223 HGIE2, 3235 NCOI, 

CysProLeuProProProLysSerProProValProProProArgLysLysArgThrVal 
3242 TGCCCGCTTCCACCTCCAAAGTCCCCTCCTGTGCCTCCGCCTCGGAAGAAGCGGACGGTG 
ACGGGCGAAGGTGGAGGTTTCAGGGGAGGACACGGAGGCGGAGCCTTCTTCGCCTGCCAC 

ValLeuThrGluSerThrLeuSerThrAlaLeuAlaGluLeuAlaThrArgSerPheGly 
3302 GTCCTCACTGAATCAACCCTATCTACTGCCTTGGCCGAGCTCGCCACCAGAAGCTTTGGC 
CAGGAGTGACTTAGTTGGGATAGATGACGGAACCGGCTCGAGCGGTGGTCTTCGAAACCG 

3338 SACI, 3352 HIND3, 

SerSerSerThrSerGlyIleThrGlyAspAsnThrThrThrSerSerGluProAlaPro 
3362 AGCTCCTCAACTTCCGGCATTACGGGCGACAATACGACAACATCCTCTGAGCCCGCCCCT 
TCGAGGAGTTGAAGGCCGTAATGCCCGCTGTTATGCTGTTGTAGGAGACTCGGGCGGGGA 

SerGlyCysProProAspSerAspAlaGluSerTyrSerSerMetProProLeuGluGly 
3422 TCTGGCTGCCCCCCCGACTCCGACGCTGAGTCCTATTCCTCCATGCCCCCCCTGGAGGGG 
AGACCGACGGGGGGGCTGAGGCTGCGACTCAGGATAAGGAGGTACGGGGGGGACCTCCCC 

3443 EAM1105I, 

GluProGlyAspProAspLeuSerAspGlySerTrpSerThrValSerSerGluAlaAsn 
3482 GAGCCTGGGGATCCGGATCTTAGCGACGGGTCATGGTCAACGGTCAGTAGTGAGGCCAAC 
CTCGGACCCCTAGGCCTAGAATCGCTGCCCAGTACCAGTTGCCAGTCATCACTCCGGTTG 

A A A 

3490 BAMHI, 3491 BSABI, 3493 BSPEI, 

AlaGluAspValValCysCysSerMetSerTyrSerTrpThrGlyAlaLeuValThrPro 
3542 GCGGAGGATGTCGTGTGCTGCTCAATGTCTTACTCTTGGACAGGCGCACTCGTCACCCCG 
CGCCTCCTACAGCACACGACGAGTTACAGAATGAGAACCTGTCCGCGTGAGCAGTGGGGC 

A 

3595 DRA3, 

CysAlaAlaGluGluGlnLysLeuProIleAsnAlaLeuSerAsnSerLeuLeuArgHis 
3602 TGCGCCGCGGAAGAACAGAAACTGCCCATCAATGCACTAAGCAACTCGTTGCTACGTCAC 
ACGCGGCGCCTTCTTGTCTTTGACGGGTAGTTACGTGATTCGTTGAGCAACGATGCAGTG 

A A ^ 

3606 SAC2, 3617 ALWNI, 3661 PFLMI, 

HisAsnLeuValTyrSerThrThrSerArgSerAlaCysGlnArgGlnLysLysValThr 
3662 CACAATTTGGTGTATTCCACCACCTCACGCAGTGCTTGCCAAAGGCAGAAGAAAGTCACA 
GTGTTAAACCACATAAGGTGGTGGAGTGCGTCACGAACGGTTTCCGTCTTCTTTCAGTGT 

A 

3687 DRA3, 

PheAspArgLeuGlnValLeuAspSerHisTyrGlnAspValLeuLysGluValLysAla 
3722 TTTGACAGACTGCAAGTTCTGGACAGCCATTACCAGGACGTACTCAAGGAGGTTAAAGCA 
AAACTGTCTGACGTTCAAGACCTGTCGGTAATGGTCCTGCATGAGTTCCTCCAATTTCGT 

AlaAlaSerLysValLysAlaAsnLeuLeuSerValGluGluAlaCysSerLeuThrPro 
3782 GCGGCGTCAAAAGTGAAGGCTAACTTGCTATCCGTAGAGGAAGCTTGCAGCCTGACGCCC 
CGCCGCAGTTTTCACTTCCGATTGAACGATAGGCATCTCCTTCGAACGTCGGACTGCGGG 

3822 HIND3, 

ProHisSerAlaLysSerLysPheGlyTyrGlyAlaLysAspValArgCysHisAlaArg 
3842 CCACACTCAGCCAAATCCAAGTTTGGTTATGGGGCAAAAGACGTCCGTTGCCATGCCAGA 
GGTGTGAGTCGGTTTAGGTTCAAACCAATACCCCGTTTTCTGCAGGCAACGGTACGGTCT 

3881 AAT2, 3896 BGLI, 

LysAlaValThrHisIleAsnSerValTrpLysAspLeuLeuGluAspAsnValThrPro 
3902 AAGGCCGTAACCCACATCAACTCCGTGTGGAAAGACCTTCTGGAAGACAATGTAACACCA 
TTCCGGCATTGGGTGTAGTTGAGGCACACCTTTCTGGAAGACCTTCTGTTACATTGTGGT 

IleAspThrThrIleMetAlaLysAsnGluValPheCysValGlnProGluLysGlyGly 
3962 ATAGACACTACCATCATGGCTAAGAACGAGGTTTTCTGCGTTCAGCCTGAGAAGGGGGGT 
TATCTGTGATGGTAGTACCGATTCTTGCTCCAAAAGACGCAAGTCGGACTCTTCCCCCCA 

ArgLysProAlaArgLeuIleValPheProAspLeuGlyValArgValCysGluLysMet 
4022 CGTAAGCCAGCTCGTCTCATCGTGTTCCCCGATCTGGGCGTGCGCGTGTGCGAAAAGATG 
GCATTCGGTCGAGCAGAGTAGCACAAGGGGCTAGACCCGCACGCGCACACGCTTTTCTAC 

AlaLeuTyrAspValValThrLysLeuProLeuAlaValMetGlySerSerTyrGlyPhe 
4082 GCTTTGTACGACGTGGTTACAAAGCTCCCCTTGGCCGTGATGGGAAGCTCCTACGGATTC 
CGAAACATGCTGCACCAATGTTTCGAGGGGAACCGGCACTACCCTTCGAGGATGCCTAAG 

GlnTyrSerProGlyGlnArgValGluPheLeuValGlnAlaTrpLysSerLysLysThr 
4142 CAATACTCACCAGGACAGCGGGTTGAATTCCTCGTGCAAGCGTGGAAGTCCAAGAAAACC 
GTTATGAGTGGTCCTGTCGCCCAACTTAAGGAGCACGTTCGCACCTTCAGGTTCTTTTGG 

4166 ECORI, 

ProMetGlyPheSerTyrAspThrArgCysPheAspSerThrValThrGluSerAspIle 
4202 CCAATGGGGTTCTCGTATGATACCCGCTGCTTTGACTCCACAGTCACTGAGAGCGACATC 
GGTTACCCCAAGAGCATACTATGGGCGACGAAACTGAGGTGTCAGTGACTCTCGCTGTAG 

4235 DRDI, 4242 ALWNI, 

ArgThrGluGluAlaIleTyrGlnCysCysAspLeuAspProGlnAlaArgValAlaIle 
4262 CGTACGGAGGAGGCAATCTACCAATGTTGTGACCTCGACCCCCAAGCCCGCGTGGCCATC 
GCATGCCTCCTCCGTTAGATGGTTACAACACTGGAGCTGGGGGTTCGGGCGCACCGGTAG 

A 'V 

4307 BGLI, 4314 BALI, 

LysSerLeuThrGluArgLeuTyrValGlyGlyProLeuThrAsnSerArgGlyGluAsn 
4322 AAGTCCCTCACCGAGAGGCTTTATGTTGGGGGCCCTCTTACCAATTCAAGGGGGGAGAAC 
TTCAGGGAGTGGCTCTCCGAAATACAACCCCCGGGAGAATGGTTAAGTTCCCCCCTCTTG 

A 

4351 APAI, 

CysGlyTyrArgArgCysArgAlaSerGlyValLeuThrThrSerCysGlyAsnThrLeu 
4382 TGCGGCTATCGCAGGTGCCGCGCGAGCGGCGTACTGACAACTAGCTGTGGTAACACCCTC 
ACGCCGATAGCGTCCACGGCGCGCTCGCCGCATGACTGTTGATCGACACCATTGTGGGAG 

ThrCysTyrIleLysAlaArgAlaAlaCysArgAlaAlaGlyLeuGlnAspCysThrMet 
4442 ACTTGCTACATCAAGGCCCGGGCAGCCTGTCGAGCCGCAGGGCTCCAGGACTGCACCATG 
TGAACGATGTAGTTCCGGGCCCGTCGGACAGCTCGGCGTCCCGAGGTCCTGACGTGGTAC 

A, 

4458 SMAI XMAI, 

LeuValCysGlyAspAspLeuValValIleCysGluSerAlaGlyValGlnGluAspAla 
4502 CTCGTGTGTGGCGACGACTTAGTCGTTATCTGTGAAAGCGCGGGGGTCCAGGAGGACGCG 
GAGCACACACCGCTGCTGAATCAGCAATAGACACTTTCGCGCCCCCAGGTCCTCCTGCGC 

A A 

4514 DRDI, 4517 TTH3I, 

AlaSerLeuArgAlaPheThrGluAlaMetThrArgTyrSerAlaProProGlyAspPro 
4562 GCGAGCCTGAGAGCCTTCACGGAGGCTATGACCAGGTACTCCGCCCCCCCTGGGGACCCC 
CGCTCGGACTCTCGGAAGTGCCTCCGATACTGGTCCATGAGGCGGGGGGGACCCCTGGGG 

ProGlnProGluTyrAspLeuGluLeuIleThrSerCysSerSerAsnValSerValAla 
4622 CCACAACCAGAATACGACTTGGAGCTCATAACATCATGCTCCTCCAACGTGTCAGTCGCC 
GGTGTTGGTCTTATGCTGAACCTCGAGTATTGTAGTACGAGGAGGTTGCACAGTCAGCGG 

4643 SACI, 

HisAspGlyAlaGlyLysArgValTyrTyrLeuThrArgAspProThrThrProLeuAla 
4682 CACGACGGCGCTGGAAAGAGGGTCTACTACCTCACCCGTGACCCTACAACCCCCCTCGCG 
GTGCTGCCGCGACCTTTCTCCCAGATGATGGAGTGGGCACTGGGATGTTGGGGGGAGCGC 

4737 NRUI, 

ArgAlaAlaTrpGluThrAlaArgHisThrProValAsnSerTrpLeuGlyAsnIleIle 
4742 AGAGCTGCGTGGGAGACAGCAAGACACACTCCAGTCAATTCCTGGCTAGGCAACATAATC 
TCTCGACGCACCCTCTGTCGTTCTGTGTGAGGTCAGTTAAGGACCGATCCGTTGTATTAG 

MetPheAlaProThrLeuTrpAlaArgMetIleLeuMetThrHisPhePheSerValLeu 
4802 ATGTTTGCCCCCACACTGTGGGCGAGGATGATACTGATGACCCATTTCTTTAGCGTCCTT 
TACAAACGGGGGTGTGACACCCGCTCCTACTATGACTACTGGGTAAAGAAATCGCAGGAA 

4812 PFLMI, 4813 DRA3, 

IleAlaArgAspGlnLeuGluGlnAlaLeuAspCysGluIleTyrGlyAlaCysTyrSer 
4862 ATAGCCAGGGACCAGCTTGAACAGGCCCTCGATTGCGAGATCTACGGGGCCTGCTACTCC 
TATCGGTCCCTGGTCGAACTTGTCCGGGAGCTAACGCTCTAGATGCCCCGGACGATGAGG 

A 

4899 BGL2, 

IleGluProLeuAspLeuProProIleIleGlnArgLeuHisGlyLeuSerAlaPheSer 
4922 ATAGAACCACTGGATCTACCTCCAATCATTCAAAGACTCCATGGCCTCAGCGCATTTTCA 
TATCTTGGTGACCTAGATGGAGGTTAGTAAGTTTCTGAGGTACCGGAGTCGCGTAAAAGT 

A 

4960 NCOI, 

LeuHisSerTyrSerProGlyGluIleAsnArgValAlaAlaCysLeuArgLysLeuGly 
4982 CTCCACAGTTACTCTCCAGGTGAAATCAATAGGGTGGCCGCATGCCTCAGAAAACTTGGG 
GAGGTGTCAATGAGAGGTCCACTTTAGTTATCCCACCGGCGTACGGAGTCTTTTGAACCC 

A A 

5021 SPHI, 5041 KPNI, 

ValProProLeuArgAlaTrpArgHisArgAlaArgSerValArgAlaArgLeuLeuAla 
5042 GTACCGCCCTTGCGAGCTTGGAGACACCGGGCCCGGAGCGTCCGCGCTAGGCTTCTGGCC 
CATGGCGGGAACGCTCGAACCTCTGTGGCCCGGGCCTCGCAGGCGCGATCCGAAGACCGG 

A A 

5070 APAI, 5097 BALI, 

ArgGlyGlyArgAlaAlaIleCysGlyLysTyrLeuPheAsnTrpAlaValArgThrLys 
5102 AGAGGAGGCAGGGCTGCCATATGTGGCAAGTACCTCTTCAACTGGGCAGTAAGAACAAAG 
TCTCCTCCGTCCCGACGGTATACACCGTTCATGGAGAAGTTGACCCGTCATTCTTGTTTC 

5119 NDEI, 

LeuLysLeuThrProIleAlaAlaAlaGlyGlnLeuAspLeuSerGlyTrpPheThrAla 
5162 CTCAAACTCACTCCAATAGCGGCCGCTGGCCAGCTGGACTTGTCCGGCTGGTTCACGGCT 
GAGTTTGAGTGAGGTTATCGCCGGCGACCGGTCGACCTGAACAGGCCGACCAAGTGCCGA 

5180 NOTI, 5181 EAGI XMA3, 5188 BALI, 5192 PVU2, 

GlyTyrSerGlyGlyAspIleTyrHisSerValSerHisAlaArgProArgTrpIleTrp 
5222 GGCTACAGCGGGGGAGACATTTATCACAGCGTGTCTCATGCCCGGCCCCGCTGGATCTGG 
CCGATGTCGCCCCCTCTGTAAATAGTGTCGCACAGAGTACGGGCCGGGGCGACCTAGACC 

A 

5246 DRA3, 

PheCysLeuLeuLeuLeuAlaAlaGlyValGlyIleTyrLeuLeuProAsnArgOP 
5282 TTTTGCCTACTCCTGCTTGCTGCAGGGGTAGGCATCTACCTCCTCCCCAACCGATGAAGG 
AAAACGGATGAGGACGAACGACGTCCCCATCCGTAGATGGAGGAGGGGTTGGCTACTTCC 

A M 

5301 PSTI, 5331 HGIE2, 



5342 TTGGGGTAAACACTCCGGCCTAAAAAAAAAAAAAAATCTAGAACCCGAGTCGAC 
AACCCCATTTGTGAGGCCGGATTTTTTTTTTTTTTTAGATCTTGGGCTCAGCTG 

A 

5378 XBAI, 5390 SALI, 

MetAlaAlaTyrAlaAlaGlnGlyTyrLysValLeuValLeuAsn 
2 AGCTTACAAAACAAAATGGCTGCATATGCAGCTCAGGGCTATAAGGTGCTAGTACTCAAC 
TCGAATGTTTTGTTTTACCGACGTATACGTCGAGTCCCGATATTCCACGATCATGAGTTG 

1 HIND3, 24 NDEI, 52 SCAI, 

ProSerValAlaAlaThrLeuGlyPheGlyAlaTyrMetSerLysAlaHisGlyIleAsp 
62 CCCTCTGTTGCTGCAACACTGGGCTTTGGTGCTTACATGTCCAAGGCTCATGGGATCGAT 
GGGAGACAACGACGTTGTGACCCGAAACCACGAATGTACAGGTTCCGAGTACCCTAGCTA 

A 

116 CLAI, 

ProAsnIleArgThrGlyValArgThrIleThrThrGlySerProIleThrTyrSerThr 
122 CCTAACATCAGGACCGGGGTGAGAACAATTACCACTGGCAGCCCCATCACGTACTCCACC 
GGATTGTAGTCCTGGCCCCACTCTTGTTAATGGTGACCGTCGGGGTAGTGCATGAGGTGG 

TyrGlyLysPheLeuAlaAspGlyGlyCysSerGlyGlyAlaTyrAspIleIleIleCys 
182 TACGGCAAGTTCCTTGCCGACGGCGGGTGCTCGGGGGGCGCTTATGACATAATAATTTGT 
ATGCCGTTCAAGGAACGGCTGCCGCCCACGAGCCCCCCGCGAATACTGTATTATTAAACA 

AspGluCysHisSerThrAspAlaThrSerIleLeuGlyIleGlyThrValLeuAspGln 
242 GACGAGTGCCACTCCACGGATGCCACATCCATCTTGGGCATTGGCACTGTCCTTGACCAA 
CTGCTCACGGTGAGGTGCCTACGGTGTAGGTAGAACCCGTAACCGTGACAGGAACTGGTT 

AlaGluThrAlaGlyAlaArgLeuValValLeuAlaThrAlaThrProProGlySerVal 
302 GCAGAGACTGCGGGGGCGAGACTGGTTGTGCTCGCCACCGCCACCCCTCCGGGCTCCGTC 
CGTCTCTGACGCCCCCGCTCTGACCAACACGAGCGGTGGCGGTGGGGAGGCCCGAGGCAG 

A 

303 ALWNI, 

ThrValProHisProAsnIleGluGluValAlaLeuSerThrThrGlyGluIleProPhe 
362 ACTGTGCCCCATCCCAACATCGAGGAGGTTGCTCTGTCCACCACCGGAGAGATCCCTTTT 
TGACACGGGGTAGGGTTGTAGCTCCTCCAACGAGACAGGTGGTGGCCTCTCTAGGGAAAA 

TyrGlyLysAlaIleProLeuGluValIleLysGlyGlyArgHisLeuIlePheCysHis 
422 TACGGCAAGGCTATCCCCCTCGAAGTAATCAAGGGGGGGAGACATCTCATCTTCTGTCAT 
ATGCCGTTCCGATAGGGGGAGCTTCATTAGTTCCCCCCCTCTGTAGAGTAGAAGACAGTA 

SerLysLysLysCysAspGluLeuAlaAlaLysLeuValAlaLeuGlyIleAsnAlaVal 
482 TCAAAGAAGAAGTGCGACGAACTCGCCGCAAAGCTGGTCGCATTGGGCATCAATGCCGTG 
AGTTTCTTCTTCACGCTGCTTGAGCGGCGTTTCGACCAGCGTAACCCGTAGTTACGGCAC 

AlaTyrTyrArgGlyLeuAspValSerValIleProThrSerGlyAspValValValVal 
542 GCCTACTACCGCGGTCTTGACGTGTCCGTCATCCCGACCAGCGGCGATGTTGTCGTCGTG 
CGGATGATGGCGCCAGAACTGCACAGGCAGTAGGGCTGGTCGCCGCTACAACAGCAGCAC 

A y\ 

550 SAC2, 560 DRDI, 

AlaThrAspAlaLeuMetThrGlyTyrThrGlyAspPheAspSerValIleAspCysAsn 
602 GCAACCGATGCCCTCATGACCGGCTATACCGGCGACTTCGACTCGGTGATAGACTGCAAT 
CGTTGGCTACGGGAGTACTGGCCGATATGGCCGCTGAAGCTGAGCCACTATCTGACGTTA 

A 

615 BSPHI, 

ThrCysValThrGlnThrValAspPheSerLeuAspProThrPheThrIleGluThrIle 
662 ACGTGTGTCACCCAGACAGTCGATTTCAGCCTTGACCCTACCTTCACCATTGAGACAATC 
TGCACACAGTGGGTCTGTCAGCTAAAGTCGGAACTGGGATGGAAGTGGTAACTCTGTTAG 

ThrLeuProGlnAspAlaValSerArgThrGlnArgArgGlyArgThrGlyArgGlyLys 
722 ACGCTCCCCCAAGATGCTGTCTCCCGCACTCAACGTCGGGGCAGGACTGGCAGGGGGAAG 
TGCGAGGGGGTTCTACGACAGAGGGCGTGAGTTGCAGCCCCGTCCTGACCGTCCCCCTTC 

ProGlyIleTyrArgPheValAlaProGlyGluArgProSerGlyMetPheAspSerSer 
782 CCAGGCATCTACAGATTTGTGGCACCGGGGGAGCGCCCCTCCGGCATGTTCGACTCGTCC 
GGTCCGTAGATGTCTAAACACCGTGGCCCCCTCGCGGGGAGGCCGTACAAGCTGAGCAGG 

816 BGLI, 833 DRDI, 

ValLeuCysGluCysTyrAspAlaGlyCysAlaTrpTyrGluLeuThrProAlaGluThr 
842 GTCCTCTGTGAGTGCTATGACGCAGGCTGTGCTTGGTATGAGCTCACGCCCGCCGAGACT 
CAGGAGACACTCACGATACTGCGTCCGACACGAACCATACTCGAGTGCGGGCGGCTCTGA 

881 SACI, 

ThrValArgLeuArgAlaTyrMetAsnThrProGlyLeuProValCysGlnAspHisLeu 
902 ACAGTTAGGCTACGAGCGTACATGAACACCCCGGGGCTTCCCGTGTGCCAGGACCATCTT 
TGTCAATCCGATGCTCGCATGTACTTGTGGGGCCCCGAAGGGCACACGGTCCTGGTAGAA 

931 SMAI XMAI, 

GluPheTrpGluGlyValPheThrGlyLeuThrHisIleAspAlaHisPheLeuSerGln 
962 GAATTTTGGGAGGGCGTCTTTACAGGCCTCACTCATATAGATGCCCACTTTCTATCCCAG 
CTTAAAACCCTCCCGCAGAAATGTCCGGAGTGAGTATATCTACGGGTGAAAGATAGGGTC 

985 STUI, 

ThrLysGlnSerGlyGluAsnLeuProTyrLeuValAlaTyrGlnAlaThrValCysAla 
1022 ACAAAGCAGAGTGGGGAGAACCTTCCTTACCTGGTAGCGTACCAAGCCACCGTGTGCGCT 
TGTTTCGTCTCACCCCTCTTGGAAGGAATGGACCATCGCATGGTTCGGTGGCACACGCGA 

1069 DRA3, 

ArgAlaGlnAlaProProProSerTrpAspGlnMetTrpLysCysLeuIleArgLeuLys 
1082 AGGGCTCAAGCCCCTCCCCCATCGTGGGACCAGATGTGGAAGTGTTTGATTCGCCTCAAG 
TCCCGAGTTCGGGGAGGGGGTAGCACCCTGGTCTACACCTTCACAAACTAAGCGGAGTTC 

ProThrLeuHisGlyProThrProLeuLeuTyrArgLeuGlyAlaValGlnAsnGluIle 
1142 CCCACCCTCCATGGGCCAACACCCCTGCTATACAGACTGGGCGCTGTTCAGAATGAAATC 
GGGTGGGAGGTACCCGGTTGTGGGGACGATATGTCTGACCCGCGACAAGTCTTACTTTAG 

1150 NCOI, 

ThrLeuThrHisProValThrLysTyrIleMetThrCysMetSerAlaAspLeuGluVal 
1202 ACCCTGACGCACCCAGTCACCAAATACATCATGACATGCATGTCGGCCGACCTGGAGGTC 
TGGGACTGCGTGGGTCAGTGGTTTATGTAGTACTGTACGTACAGCCGGCTGGACCTCCAG 

1230 BSPHI, 1234 DRDI, 1237 AVA3, 1245 EAGI XMA3, 1250 DRDI, 



ValThrSerThrTrpValLeuValGlyGlyValLeuAlaAlaLeuAlaAlaTyrCysLeu 
1262 GTCACGAGCACCTGGGTGCTCGTTGGCGGCGTCCTGGCTGCTTTGGCCGCGTATTGCCTG 
CAGTGCTCGTGGACCCACGAGCAACCGCCGCAGGACCGACGAAACCGGCGCATAACGGAC 

SerThrGlyCysValValIleValGlyArgValValLeuSerGlyLysProAlaIleIle 
1322 TCAACAGGCTGCGTGGTCATAGTGGGCAGGGTCGTCTTGTCCGGGAAGCCGGCAATCATA 
AGTTGTCCGACGCACCAGTATCACCCGTCCCAGCAGAACAGGCCCTTCGGCCGTTAGTAT 

1369 NAEI, 

ProAspArgGluValLeuTyrArgGluPheAspGluMetGluGluCysSerGlnHisLeu 
1382 CCTGACAGGGAAGTCCTCTACCGAGAGTTCGATGAGATGGAAGAGTGCTCTCAGCACTTA 
GGACTGTCCCTTCAGGAGATGGCTCTCAAGCTACTCTACCTTCTCACGAGAGTCGTGAAT 

1385 DRDI, 

ProTyrIleGluGlnGlyMetMetLeuAlaGluGlnPheLysGlnLysAlaLeuGlyLeu 
1442 CCGTACATCGAGCAAGGGATGATGCTCGCCGAGCAGTTCAAGCAGAAGGCCCTCGGCCTC 
GGCATGTAGCTCGTTCCCTACTACGAGCGGCTCGTCAAGTTCGTCTTCCGGGAGCCGGAG 

LeuGlnThrAlaSerArgGlnAlaGluValIleAlaProAlaValGlnThrAsnTrpGln 
1502 CTGCAGACCGCGTCCCGTCAGGCAGAGGTTATCGCCCCTGCTGTCCAGACCAACTGGCAA 
GACGTCTGGCGCAGGGCAGTCCGTCTCCAATAGCGGGGACGACAGGTCTGGTTGACCGTT 

1502 PSTI, 1507 TTH3I, 

LysLeuGluThrPheTrpAlaLysHisMetTrpAsnPheIleSerGlyIleGlnTyrLeu 
1562 AAACTCGAGACCTTCTGGGCGAAGCATATGTGGAACTTCATCAGTGGGATACAATACTTG 
TTTGAGCTCTGGAAGACCCGCTTCGTATACACCTTGAAGTAGTCACCCTATGTTATGAAC 

1565 XHOI, 1586 NDEI, 

AlaGlyLeuSerThrLeuProGlyAsnProAlaIleAlaSerLeuMetAlaPheThrAla 
1622 GCGGGCTTGTCAACGCTGCCTGGTAACCCCGCCATTGCTTCATTGATGGCTTTTACAGCT 
CGCCCGAACAGTTGCGACGGACCATTGGGGCGGTAACGAAGTAACTACCGAAAATGTCGA 

1643 BSTE2, 1677 ALWNI PVU2, 

AlaValThrSerProLeuThrThrSerGlnThrLeuLeuPheAsnIleLeuGlyGlyTrp 
1682 GCTGTCACCAGCCCACTAACCACTAGCCAAACCCTCCTCTTCAACATATTGGGGGGGTGG 
CGACAGTGGTCGGGTGATTGGTGATCGGTTTGGGAGGAGAAGTTGTATAACCCCCCCACC 

ValAlaAlaGlnLeuAlaAlaProGlyAlaAlaThrAlaPheValGlyAlaGlyLeuAla 
1742 GTGGCTGCCCAGCTCGCCGCCCCCGGTGCCGCTACTGCCTTTGTGGGCGCTGGCTTAGCT 
CACCGACGGGTCGAGCGGCGGGGGCCACGGCGATGACGGAAACACCCGCGACCGAATCGA 

A 

1794 ESPI, 

GlyAlaAlaIleGlySerValGlyLeuGlyLysValLeuIleAspIleLeuAlaGlyTyr 
1802 GGCGCCGCCATCGGCAGTGTTGGACTGGGGAAGGTCCTCATAGACATCCTTGCAGGGTAT 
CCGCGGCGGTAGCCGTCACAACCTGACCCCTTCCAGGAGTATCTGTAGGAACGTCCCATA 

1802 KASI NARI, 

GlyAlaGlyValAlaGlyAlaLeuValAlaPheLysIleMetSerGlyGluValProSer 
1862 GGCGCGGGCGTGGCGGGAGCTCTTGTGGCATTCAAGATCATGAGCGGTGAGGTCCCCTCC 
CCGCGCCCGCACCGCCCTCGAGAACACCGTAAGTTCTAGTACTCGCCACTCCAGGGGAGG 

A A 

1878 SACI, 1899 BSPHI, 

ThrGluAspLeuValAsnLeuLeuProAlaIleLeuSerProGlyAlaLeuValValGly 
1922 ACGGAGGACCTGGTCAATCTACTGCCCGCCATCCTCTCGCCCGGAGCCCTCGTAGTCGGC 
TGCCTCCTGGACCAGTTAGATGACGGGCGGTAGGAGAGCGGGCCTCGGGAGCATCAGCCG 

1928 TTH3I, 

ValValCysAlaAlaIleLeuArgArgHisValGlyProGlyGluGlyAlaValGlnTrp 
1982 GTGGTCTGTGCAGCAATACTGCGCCGGCACGTTGGCCCGGGCGAGGGGGCAGTGCAGTGG 
CACCAGACACGTCGTTATGACGCGGCCGTGCAACCGGGCCCGCTCCCCCGTCACGTCACC 

A A 

2004 NAEI, 2017 SMAI XMAI, 

MetAsnArgLeuIleAlaPheAlaSerArgGlyAsnHisValSerProThrHisTyrVal 
2042 ATGAACCGGCTGATAGCCTTCGCCTCCCGGGGGAACCATGTTTCCCCCACGCACTACGTG 
TACTTGGCCGACTATCGGAAGCGGAGGGCCCCCTTGGTACAAAGGGGGTGCGTGATGCAC 

A 

2067 SMAI XMAI, 2093 DRA3, 

ProGluSerAspAlaAlaAlaArgValThrAlaIleLeuSerSerLeuThrValThrGln 
2102 CCGGAGAGCGATGCAGCTGCCCGCGTCACTGCCATACTCAGCAGCCTCACTGTAACCCAG 
GGCCTCTCGCTACGTCGACGGGCGCAGTGACGGTATGAGTCGTCGGAGTGACATTGGGTC 

2115 PVU2, 2159 ALWNI, 

LeuLeuArgArgLeuHisGlnTrpIleSerSerGluCysThrThrProCysSerGlySer 
2162 CTCCTGAGGCGACTGCACCAGTGGATAAGCTCGGAGTGTACCACTCCATGCTCCGGTTCC 
GAGGACTCCGCTGACGTGGTCACCTATTCGAGCCTCACATGGTGAGGTACGAGGCCAAGG 

A 

A 

2164 MST2, 2220 ECONI, 

TrpLeuArgAspIleTrpAspTrpIleCysGluValLeuSerAspPheLysThrTrpLeu 
2222 TGGCTAAGGGACATCTGGGACTGGATATGCGAGGTGTTGAGCGACTTTAAGACCTGGCTA 
ACCGATTCCCTGTAGACCCTGACCTATACGCTCCACAACTCGCTGAAATTCTGGACCGAT 

LysAlaLysLeuMetProGlnLeuProGlyIleProPheValSerCysGlnArgGlyTyr 
2282 AAAGCTAAGCTCATGCCACAGCTGCCTGGGATCCCCTTTGTGTCCTGCCAGCGCGGGTAT 
TTTCGATTCGAGTACGGTGTCGACGGACCCTAGGGGAAACACAGGACGGTCGCGCCCATA 

A ^ A 

2285 ESPI, 2300 PVU2, 2310 BAMHI, 

LysGlyValTrpArgGlyAspGlyIleMetHisThrArgCysHisCysGlyAlaGluIle 
2342 AAGGGGGTCTGGCGAGGGGACGGCATCATGCACACTCGCTGCCACTGTGGAGCTGAGATC 
TTCCCCCAGACCGCTCCCCTGCCGTAGTACGTGTGAGCGACGGTGACACCTCGACTCTAG 

ThrGlyHisValLysAsnGlyThrMetArgIleValGlyProArgThrCysArgAsnMet 
2402 ACTGGACATGTCAAAAACGGGACGATGAGGATCGTCGGTCCTAGGACCTGCAGGAACATG 
TGACCTGTACAGTTTTTGCCCTGCTACTCCTAGCAGCCAGGATCCTGGACGTCCTTGTAC 

A A A A 

2425 BSABI, 2441 AVR2, 2448 SSE8387I, 2449 PSTI, 

TrpSerGlyThrPheProIleAsnAlaTyrThrThrGlyProCysThrProLeuProAla 
2462 TGGAGTGGGACCTTCCCCATTAATGCCTACACCACGGGCCCCTGTACCCCCCTTCCTGCG 
ACCTCACCCTGGAAGGGGTAATTACGGATGTGGTGCCCGGGGACATGGGGGGAAGGACGC 

A A 

2480 ASEI, 2497 APAI, 

ProAsnTyrThrPheAlaLeuTrpArgValSerAlaGluGluTyrValGluIleArgGln 
2522 CCGAACTACACGTTCGCGCTATGGAGGGTGTCTGCAGAGGAATACGTGGAGATAAGGCAG 
GGCTTGATGTGCAAGCGCGATACCTCCCACAGACGTCTCCTTATGCACCTCTATTCCGTC 

2553 PSTI, 

ValGlyAspPheHisTyrValThrGlyMetThrThrAspAsnLeuLysCysProCysGln 
2582 GTGGGGGACTTCCACTACGTGACGGGTATGACTACTGACAATCTTAAATGCCCGTGCCAG 
CACCCCCTGAAGGTGATGCACTGCCCATACTGATGACTGTTAGAATTTACGGGCACGGTC 

2594 DRA3, 

ValProSerProGluPhePheThrGluLeuAspGlyValArgLeuHisArgPheAlaPro 
2642 GTCCCATCGCCCGAATTTTTCACAGAATTGGACGGGGTGCGCCTACATAGGTTTGCGCCC 
CAGGGTAGCGGGCTTAAAAAGTGTCTTAACCTGCCCCACGCGGATGTATCCAAACGCGGG 

ProCysLysProLeuLeuArgGluGluValSerPheArgValGlyLeuHisGluTyrPro
2702 CCCTGCAAGCCCTTGCTGCGGGAGGAGGTATCATTCAGAGTAGGACTCCACGAATACCCG
GGGACGTTCGGGAACGACGCCCTCCTCCATAGTAAGTCTCATCCTGAGGTGCTTATGGGC

2757 HGIE2, 

ValGlySerGlnLeuProCysGluProGluProAspValAlaValLeuThrSerMetLeu
2762 GTAGGGTCGCAATTACCTTGCGAGCCCGAACCGGACGTGGCCGTGTTGACGTCCATGCTC
CATCCCAGCGTTAATGGAACGCTCGGGCTTGGCCTGCACCGGCACAACTGCAGGTACGAG

2809 AAT2,

ThrAspProSerHisIleThrAlaGluAlaAlaGlyArgArgLeuAlaArgGlySerPro
2822 ACTGATCCCTCCCATATAACAGCAGAGGCGGCCGGGCGAAGGTTGGCGAGGGGATCACCC
TGACTAGGGAGGGTATATTGTCGTCTCCGCCGGCCCGCTTCCAACCGCTCCCCTAGTGGG

2850 EAGl XMA3, 

ProSerValAlaSerSerSerAlaSerGlnLeuSerAlaProSerLeuLysAlaThrCys 
2882 CCCTCTGTGGCCAGCTCCTCGGCTAGCCAGCTATCCGCTCCATCTCTCAAGGCAACTTGC
GGGAGACACCGGTCGAGGAGCCGATCGGTCGATAGGCGAGGTAGAGAGTTCCGTTGAACG

2889 BALI, 2903 NHEI, 

ThrAlaAsnHisAspSerProAspAlaGluLeuIleGluAlaAsnLeuLeuTrpArgGln
2942 ACCGCTAACCATGACTCCCCTGATGCTGAGCTCATAGAGGCCAACCTCCTATGGAGGCAG
TGGCGATTGGTACTGAGGGGACTACGACTCGAGTATCTCCGGTTGGAGGATACCTCCGTC

2966 ESPl, 2969 SACI, 

GluMetGlyGlyAsnIleThrArgValGluSerGluAsnLysValValIleLeuAspSer
3002 GAGATGGGCGGCAACATCACCAGGGTTGAGTCAGAAAACAAAGTGGTGATTCTGGACTCC
CTCTACCCGCCGTTGTAGTGGTCCCAACTCAGTCTTTTGTTTCACCACTAAGACCTGAGG

PheAspProLeuValAlaGluGluAspGluArgGluIleSerValProAlaGluIleLeu 
3062 TTCGATCCGCTTGTGGCGGAGGAGGACGAGCGGGAGATCTCCGTACCCGCAGAAATCCTG
AAGCTAGGCGAACACCGCCTCCTCCTGCTCGCCCTCTAGAGGCATGGGCGTCTTTAGGAC

3096 BGL2, 

ArgLysSerArgArgPheAlaGlnAlaLeuProValTrpAlaArgProAspTyrAsnPro
3122 CGGAAGTCTCGGAGATTCGCCCAGGCCCTGCCCGTTTGGGCGCGGCCGGACTATAACCCC
GCCTTCAGAGCCTCTAAGCGGGTCCGGGACGGGCAAACCCGCGCCGGCCTGATATTGGGG

3143 ALWNl, 3164 EAGl XMA3, 

ProLeuValGluThrTrpLysLysProAspTyrGluProProValValHisGlyCysPro
3182 CCGCTAGTGGAGACGTGGAAAAAGCCCGACTACGAACCACCTGTGGTCCATGGCTGCCCG 
GGCGATCACCTCTGCACCTTTTTCGGGCTGATGCTTGGTGGACACCAGGTACCGACGGGC 

3217 HGIE2, 3229 NCOI, 

LeuProProProLysSerProProValProProProArgLysLysArgThrValValLeu 
3242 CTTCCACCTCCAAAGTCCCCTCCTGTGCCTCCGCCTCGGAAGAAGCGGACGGTGGTCCTC
GAAGGTGGAGGTTTCAGGGGAGGACACGGAGGCGGAGCCTTCTTCGCCTGCCACCAGGAG 

ThrGluSerThrLeuSerThrAlaLeuAlaGluLeuAlaThrArgSerPheGlySerSer 
3302 ACTGAATCAACCCTATCTACTGCCTTGGCCGAGCTCGCCACCAGAAGCTTTGGCAGCTCC 
TGACTTAGTTGGGATAGATGACGGAACCGGCTCGAGCGGTGGTCTTCGAAACCGTCGAGG 

3332 SACI, 3346 HIND3, 

SerThrSerGlyIleThrGlyAspAsnThrThrThrSerSerGluProAlaProSerGly
3362 TCAACTTCCGGCATTACGGGCGACAATACGACAACATCCTCTGAGCCCGCCCCTTCTGGC 
AGTTGAAGGCCGTAATGCCCGCTGTTATGCTGTTGTAGGAGACTCGGGCGGGGAAGACCG 

CysProProAspSerAspAlaGluSerTyrSerSerMetProProLeuGluGlyGluPro 
3422 TGCCCCCCCGACTCCGACGCTGAGTCCTATTCCTCCATGCCCCCCCTGGAGGGGGAGCCT 
ACGGGGGGGCTGAGGCTGCGACTCAGGATAAGGAGGTACGGGGGGGACCTCCCCCTCGGA 

3437 EAM11051, 

GlyAspProAspLeuSerAspGlySerTrpSerThrValSerSerGluAlaAsnAlaGlu
3482 GGGGATCCGGATCTTAGCGACGGGTCATGGTCAACGGTCAGTAGTGAGGCCAACGCGGAG
CCCCTAGGCCTAGAATCGCTGCCCAGTACCAGTTGCCAGTCATCACTCCGGTTGCGCCTC 

^ ^ ^

3484 BAMHI, 3485 BSABl, 3487 BSPEl, 

AspValValCysCysSerMetSerTyrSerTrpThrGlyAlaLeuValThrProCysAla 
3542 GATGTCGTGTGCTGCTCAATGTCTTACTCTTGGACAGGCGCACTCGTCACCCCGTGCGCC 
CTACAGCACACGACGAGTTACAGAATGAGAACCTGTCCGCGTGAGCAGTGGGGCACGCGG 

A A 

3589 DRA3, 3600 SAC2, 

AlaGluGluGlnLysLeuProIleAsnAlaLeuSerAsnSerLeuLeuArgHisHisAsn 
3602 GCGGAAGAACAGAAACTGCCCATCAATGCACTAAGCAACTCGTTGCTACGTCACCACAAT
CGCCTTCTTGTCTTTGACGGGTAGTTACGTGATTCGTTGAGCAACGATGCAGTGGTGTTA 

A A 

3611 ALWNl, 3655 PFLMl, 

LeuValTyrSerThrThrSerArgSerAlaCysGlnArgGlnLysLysValThrPheAsp 
3662 TTGGTGTATTCCACCACCTCACGCAGTGCTTGCCAAAGGCAGAAGAAAGTCACATTTGAC
AACCACATAAGGTGGTGGAGTGCGTCACGAACGGTTTCCGTCTTCTTTCAGTGTAAACTG 

A 

3681 DRA3, 

ArgLeuGlnValLeuAspSerHisTyrGlnAspValLeuLysGluValLysAlaAlaAla 
3722 AGACTGCAAGTTCTGGACAGCCATTACCAGGACGTACTCAAGGAGGTTAAAGCAGCGGCG 
TCTGACGTTCAAGACCTGTCGGTAATGGTCCTGCATGAGTTCCTCCAATTTCGTCGCCGC 

SerLysValLysAlaAsnLeuLeuSerValGluGluAlaCysSerLeuThrProProHis
3782 TCAAAAGTGAAGGCTAACTTGCTATCCGTAGAGGAAGCTTGCAGCCTGACGCCCCCACAC
AGTTTTCACTTCCGATTGAACGATAGGCATCTCCTTCGAACGTCGGACTGCGGGGGTGTG

3816 HIND3,

SerAlaLysSerLysPheGlyTyrGlyAlaLysAspValArgCysHisAlaArgLysAla
3842 TCAGCCAAATCCAAGTTTGGTTATGGGGCAAAAGACGTCCGTTGCCATGCCAGAAAGGCC
AGTCGGTTTAGGTTCAAACCAATACCCCGTTTTCTGCAGGCAACGGTACGGTCTTTCCGG 

3875 AAT2, 3890 BGLI, 

ValThrHisIleAsnSerValTrpLysAspLeuLeuGluAspAsnValThrProIleAsp
3902 GTAACCCACATCAACTCCGTGTGGAAAGACCTTCTGGAAGACAATGTAACACCAATAGAC
CATTGGGTGTAGTTGAGGCACACCTTTCTGGAAGACCTTCTGTTACATTGTGGTTATCTG 

ThrThrIleMetAlaLysAsnGluValPheCysValGlnProGluLysGlyGlyArgLys
3962 ACTACCATCATGGCTAAGAACGAGGTTTTCTGCGTTCAGCCTGAGAAGGGGGGTCGTAAG 
TGATGGTAGTACCGATTCTTGCTCCAAAAGACGCAAGTCGGACTCTTCCCCCCAGCATTC 

ProAlaArgLeuIleValPheProAspLeuGlyValArgValCysGluLysMetAlaLeu 
4022 CCAGCTCGTCTCATCGTGTTCCCCGATCTGGGCGTGCGCGTGTGCGAAAAGATGGCTTTG 
GGTCGAGCAGAGTAGCACAAGGGGCTAGACCCGCACGCGCACACGCTTTTCTACCGAAAC 

TyrAspValValThrLysLeuProLeuAlaValMetGlySerSerTyrGlyPheGlnTyr
4082 TACGACGTGGTTACAAAGCTCCCCTTGGCCGTGATGGGAAGCTCCTACGGATTCCAATAC 
ATGCTGCACCAATGTTTCGAGGGGAACCGGCACTACCCTTCGAGGATGCCTAAGGTTATG 

SerProGlyGlnArgValGluPheLeuValGlnAlaTrpLysSerLysLysThrProMet
4142 TCACCAGGACAGCGGGTTGAATTCCTCGTGCAAGCGTGGAAGTCCAAGAAAACCCCAATG 
AGTGGTCCTGTCGCCCAACTTAAGGAGCACGTTCGCACCTTCAGGTTCTTTTGGGGTTAC 

4160 ECORI, 

GlyPheSerTyrAspThrArgCysPheAspSerThrValThrGluSerAspIleArgThr
4202 GGGTTCTCGTATGATACCCGCTGCTTTGACTCCACAGTCACTGAGAGCGACATCCGTACG 
CCCAAGAGCATACTATGGGCGACGAAACTGAGGTGTCAGTGACTCTCGCTGTAGGCATGC 

4229 DRDl, 4236 ALWNl, 

GluGluAlaIleTyrGlnCysCysAspLeuAspProGlnAlaArgValAlaIleLysSer
4262 GAGGAGGCAATCTACCAATGTTGTGACCTCGACCCCCAAGCCCGCGTGGCCATCAAGTCC
CTCCTCCGTTAGATGGTTACAACACTGGAGCTGGGGGTTCGGGCGCACCGGTAGTTCAGG 

4301 BGLI, 4308 BALI, 

LeuThrGluArgLeuTyrValGlyGlyProLeuThrAsnSerArgGlyGluAsnCysGly 
4322 CTCACCGAGAGGCTTTATGTTGGGGGCCCTCTTACCAATTCAAGGGGGGAGAACTGCGGC
GAGTGGCTCTCCGAAATACAACCCCCGGGAGAATGGTTAAGTTCCCCCCTCTTGACGCCG 

4345 APAI,

TyrArgArgCysArgAlaSerGlyValLeuThrThrSerCysGlyAsnThrLeuThrCys 
4382 TATCGCAGGTGCCGCGCGAGCGGCGTACTGACAACTAGCTGTGGTAACACCCTCACTTGC
ATAGCGTCCACGGCGCGCTCGCCGCATGACTGTTGATCGACACCATTGTGGGAGTGAACG

TyrIleLysAlaArgAlaAlaCysArgAlaAlaGlyLeuGlnAspCysThrMetLeuVal
4442 TACATCAAGGCCCGGGCAGCCTGTCGAGCCGCAGGGCTCCAGGACTGCACCATGCTCGTG
ATGTAGTTCCGGGCCCGTCGGACAGCTCGGCGTCCCGAGGTCCTGACGTGGTACGAGCAC

A 

4452 SMAI XMAI, 

CysGlyAspAspLeuValValIleCysGluSerAlaGlyValGlnGluAspAlaAlaSer
4502 TGTGGCGACGACTTAGTCGTTATCTGTGAAAGCGCGGGGGTCCAGGAGGACGCGGCGAGC
ACACCGCTGCTGAATCAGCAATAGACACTTTCGCGCCCCCAGGTCCTCCTGCGCCGCTCG 

4508 DRD1, 4511 TTH3I,

LeuArgAlaPheThrGluAlaMetThrArgTyrSerAlaProProGlyAspProProGln 
4562 CTGAGAGCCTTCACGGAGGCTATGACCAGGTACTCCGCCCCCCCTGGGGACCCCCCACAA
GACTCTCGGAAGTGCCTCCGATACTGGTCCATGAGGCGGGGGGGACCCCTGGGGGGTGTT 

ProGluTyrAspLeuGluLeuIleThrSerCysSerSerAsnValSerValAlaHisAsp
4622 CCAGAATACGACTTGGAGCTCATAACATCATGCTCCTCCAACGTGTCAGTCGCCCACGAC 
GGTCTTATGCTGAACCTCGAGTATTGTAGTACGAGGAGGTTGCACAGTCAGCGGGTGCTG 

A 

4637 SACI,

GlyAlaGlyLysArgValTyrTyrLeuThrArgAspProThrThrProLeuAlaArgAla 
4682 GGCGCTGGAAAGAGGGTCTACTACCTCACCCGTGACCCTACAACCCCCCTCGCGAGAGCT
CCGCGACCTTTCTCCCAGATGATGGAGTGGGCACTGGGATGTTGGGGGGAGCGCTCTCGA 

4731 NRUI, 

AlaTrpGluThrAlaArgHisThrProValAsnSerTrpLeuGlyAsnIleIleMetPhe
4742 GCGTGGGAGACAGCAAGACACACTCCAGTCAATTCCTGGCTAGGCAACATAATCATGTTT
CGCACCCTCTGTCGTTCTGTGTGAGGTCAGTTAAGGACCGATCCGTTGTATTAGTACAAA 

AlaProThrLeuTrpAlaArgMetIleLeuMetThrHisPhePheSerValLeuIleAla
4802 GCCCCCACACTGTGGGCGAGGATGATACTGATGACCCATTTCTTTAGCGTCCTTATAGCC
CGGGGGTGTGACACCCGCTCCTACTATGACTACTGGGTAAAGAAATCGCAGGAATATCGG 

A A 

4806 PFLMl, 4807 DRA3, 

ArgAspGlnLeuGluGlnAlaLeuAspCysGluIleTyrGlyAlaCysTyrSerIleGlu
4862 AGGGACCAGCTTGAACAGGCCCTCGATTGCGAGATCTACGGGGCCTGCTACTCCATAGAA 
TCCCTGGTCGAACTTGTCCGGGAGCTAACGCTCTAGATGCCCCGGACGATGAGGTATCTT 

A 

4893 BGL2, 

ProLeuAspLeuProProIleIleGlnArgLeuHisGlyLeuSerAlaPheSerLeuHis
4922 CCACTGGATCTACCTCCAATCATTCAAAGACTCCATGGCCTCAGCGCATTTTCACTCCAC
GGTGACCTAGATGGAGGTTAGTAAGTTTCTGAGGTACCGGAGTCGCGTAAAAGTGAGGTG 

A 

4954 NCOI, 

SerTyrSerProGlyGluIleAsnArgValAlaAlaCysLeuArgLysLeuGlyValPro 
4982 AGTTACTCTCCAGGTGAAATCAATAGGGTGGCCGCATGCCTCAGAAAACTTGGGGTACCG
TCAATGAGAGGTCCACTTTAGTTATCCCACCGGCGTACGGAGTCTTTTGAACCCCATGGC 

A ^ 

5015 SPHI, 5035 KPNI, 

ProLeuArgAlaTrpArgHisArgAlaArgSerValArgAlaArgLeuLeuAlaArgGly 
5042 CCCTTGCGAGCTTGGAGACACCGGGCCCGGAGCGTCCGCGCTAGGCTTCTGGCCAGAGGA
GGGAACGCTCGAACCTCTGTGGCCCGGGCCTCGCAGGCGCGATCCGAAGACCGGTCTCCT

5064 APAI, 5091 BALI, 

GlyArgAlaAlaIleCysGlyLysTyrLeuPheAsnTrpAlaValArgThrLysLeuLys
5102 GGCAGGGCTGCCATATGTGGCAAGTACCTCTTCAACTGGGCAGTAAGAACAAAGCTCAAA
CCGTCCCGACGGTATACACCGTTCATGGAGAAGTTGACCCGTCATTCTTGTTTCGAGTTT

5113 NDEI, 

LeuThrProIleAlaAlaAlaGlyGlnLeuAspLeuSerGlyTrpPheThrAlaGlyTyr
5162 CTCACTCCAATAGCGGCCGCTGGCCAGCTGGACTTGTCCGGCTGGTTCACGGCTGGCTAC
GAGTGAGGTTATCGCCGGCGACCGGTCGACCTGAACAGGCCGACCAAGTGCCGACCGATG 

A A A A 

5174 NOTI, 5175 EAGl XMA3, 5182 BALI, 5186 PVU2, 

SerGlyGlyAspIleTyrHisSerValSerHisAlaArgProArgTrpIleTrpPheCys 
5222 AGCGGGGGAGACATTTATCACAGCGTGTCTCATGCCCGGCCCCGCTGGATCTGGTTTTGC 
TCGCCCCCTCTGTAAATAGTGTCGCACAGAGTACGGGCCGGGGCGACCTAGACCAAAACG 

5240 DRA3, 

LeuLeuLeuLeuAlaAlaGlyValGlyIleTyrLeuLeuProAsnArgOP
5282 CTACTCCTGCTTGCTGCAGGGGTAGGCATCTACCTCCTCCCCAACCGATGAATAGTCGAC 
GATGAGGACGAACGACGTCCCCATCCGTAGATGGAGGAGGGGTTGGCTACTTATCAGCTG 

A A 

5295 PSTI, 5336 SALI, 

MetAlaAlaTyrAlaAlaGlnGlyTyrLysValLeuValLeuAsn
2 AGCTTACAAAACAAAATGGCTGCATATGCAGCTCAGGGCTATAAGGTGCTAGTACTCAAC
TCGAATGTTTTGTTTTACCGACGTATACGTCGAGTCCCGATATTCCACGATCATGAGTTG

1 HIND3, 24 NDEI, 52 SCAI, 

ProSerValAlaAlaThrLeuGlyPheGlyAlaTyrMetSerLysAlaHisGlyIleAsp
62 CCCTCTGTTGCTGCAACACTGGGCTTTGGTGCTTACATGTCCAAGGCTCATGGGATCGAT
GGGAGACAACGACGTTGTGACCCGAAACCACGAATGTACAGGTTCCGAGTACCCTAGCTA

116 CLAI, 

ProAsnIleArgThrGlyValArgThrIleThrThrGlySerProIleThrTyrSerThr
122 CCTAACATCAGGACCGGGGTGAGAACAATTACCACTGGCAGCCCCATCACGTACTCCACC
GGATTGTAGTCCTGGCCCCACTCTTGTTAATGGTGACCGTCGGGGTAGTGCATGAGGTGG 

TyrGlyLysPheLeuAlaAspGlyGlyCysSerGlyGlyAlaTyrAspIleIleIleCys
182 TACGGCAAGTTCCTTGCCGACGGCGGGTGCTCGGGGGGCGCTTATGACATAATAATTTGT 
ATGCCGTTCAAGGAACGGCTGCCGCCCACGAGCCCCCCGCGAATACTGTATTATTAAACA 

AspGluCysHisSerThrAspAlaThrSerIleLeuGlyIleGlyThrValLeuAspGln
242 GACGAGTGCCACTCCACGGATGCCACATCCATCTTGGGCATTGGCACTGTCCTTGACCAA
CTGCTCACGGTGAGGTGCCTACGGTGTAGGTAGAACCCGTAACCGTGACAGGAACTGGTT 

AlaGluThrAlaGlyAlaArgLeuValValLeuAlaThrAlaThrProProGlySerVal 
302 GCAGAGACTGCGGGGGCGAGACTGGTTGTGCTCGCCACCGCCACCCCTCCGGGCTCCGTC 
CGTCTCTGACGCCCCCGCTCTGACCAACACGAGCGGTGGCGGTGGGGAGGCCCGAGGCAG 

303 ALWNl, 

ThrValProHisProAsnIleGluGluValAlaLeuSerThrThrGlyGluIleProPhe
362 ACTGTGCCCCATCCCAACATCGAGGAGGTTGCTCTGTCCACCACCGGAGAGATCCCTTTT 
TGACACGGGGTAGGGTTGTAGCTCCTCCAACGAGACAGGTGGTGGCCTCTCTAGGGAAAA 

TyrGlyLysAlaIleProLeuGluValIleLysGlyGlyArgHisLeuIlePheCysHis
422 TACGGCAAGGCTATCCCCCTCGAAGTAATCAAGGGGGGGAGACATCTCATCTTCTGTCAT 
ATGCCGTTCCGATAGGGGGAGCTTCATTAGTTCCCCCCCTCTGTAGAGTAGAAGACAGTA 

SerLysLysLysCysAspGluLeuAlaAlaLysLeuValAlaLeuGlyIleAsnAlaVal
482 TCAAAGAAGAAGTGCGACGAACTCGCCGCAAAGCTGGTCGCATTGGGCATCAATGCCGTG
AGTTTCTTCTTCACGCTGCTTGAGCGGCGTTTCGACCAGCGTAACCCGTAGTTACGGCAC

AlaTyrTyrArgGlyLeuAspValSerValIleProThrSerGlyAspValValValVal
542 GCCTACTACCGCGGTCTTGACGTGTCCGTCATCCCGACCAGCGGCGATGTTGTCGTCGTG
CGGATGATGGCGCCAGAACTGCACAGGCAGTAGGGCTGGTCGCCGCTACAACAGCAGCAC

550 SAC2, 560 DRD1,

AlaThrAspAlaLeuMetThrGlyTyrThrGlyAspPheAspSerValIleAspCysAsn
602 GCAACCGATGCCCTCATGACCGGCTATACCGGCGACTTCGACTCGGTGATAGACTGCAAT 
CGTTGGCTACGGGAGTACTGGCCGATATGGCCGCTGAAGCTGAGCCACTATCTGACGTTA 
^

615 BSPHl, 

ThrCysValThrGlnThrValAspPheSerLeuAspProThrPheThrIleGluThrIle
662 ACGTGTGTCACCCAGACAGTCGATTTCAGCCTTGACCCTACCTTCACCATTGAGACAATC 
TGCACACAGTGGGTCTGTCAGCTAAAGTCGGAACTGGGATGGAAGTGGTAACTCTGTTAG 

ThrLeuProGlnAspAlaValSerArgThrGlnArgArgGlyArgThrGlyArgGlyLys
722 ACGCTCCCCCAAGATGCTGTCTCCCGCACTCAACGTCGGGGCAGGACTGGCAGGGGGAAG 
TGCGAGGGGGTTCTACGACAGAGGGCGTGAGTTGCAGCCCCGTCCTGACCGTCCCCCTTC 

ProGlyIleTyrArgPheValAlaProGlyGluArgProSerGlyMetPheAspSerSer
782 CCAGGCATCTACAGATTTGTGGCACCGGGGGAGCGCCCCTCCGGCATGTTCGACTCGTCC 
GGTCCGTAGATGTCTAAACACCGTGGCCCCCTCGCGGGGAGGCCGTACAAGCTGAGCAGG 

816 BGLI, 833 DRDl, 

ValLeuCysGluCysTyrAspAlaGlyCysAlaTrpTyrGluLeuThrProAlaGluThr
842 GTCCTCTGTGAGTGCTATGACGCAGGCTGTGCTTGGTATGAGCTCACGCCCGCCGAGACT 
CAGGAGACACTCACGATACTGCGTCCGACACGAACCATACTCGAGTGCGGGCGGCTCTGA 

881 SACI, 

ThrValArgLeuArgAlaTyrMetAsnThrProGlyLeuProValCysGlnAspHisLeu 
902 ACAGTTAGGCTACGAGCGTACATGAACACCCCGGGGCTTCCCGTGTGCCAGGACCATCTT 
TGTCAATCCGATGCTCGCATGTACTTGTGGGGCCCCGAAGGGCACACGGTCCTGGTAGAA 

931 SMAI XMAI, 

GluPheTrpGluGlyValPheThrGlyLeuThrHisIleAspAlaHisPheLeuSerGln
962 GAATTTTGGGAGGGCGTCTTTACAGGCCTCACTCATATAGATGCCCACTTTCTATCCCAG 
CTTAAAACCCTCCCGCAGAAATGTCCGGAGTGAGTATATCTACGGGTGAAAGATAGGGTC 

985 STUI, 

ThrLysGlnSerGlyGluAsnLeuProTyrLeuValAlaTyrGlnAlaThrValCysAla
1022 ACAAAGCAGAGTGGGGAGAACCTTCCTTACCTGGTAGCGTACCAAGCCACCGTGTGCGCT 
TGTTTCGTCTCACCCCTCTTGGAAGGAATGGACCATCGCATGGTTCGGTGGCACACGCGA 

1069 DRA3, 

FIGURE 17 - Page 3

ArgAlaGlnAlaProProProSerTrpAspGlnMetTrpLysCysLeuIleArgLeuLys
1082 AGGGCTCAAGCCCCTCCCCCATCGTGGGACCAGATGTGGAAGTGTTTGATTCGCCTCAAG
TCCCGAGTTCGGGGAGGGGGTAGCACCCTGGTCTACACCTTCACAAACTAAGCGGAGTTC

ProThrLeuHisGlyProThrProLeuLeuTyrArgLeuGlyAlaValGlnAsnGluIle
1142 CCCACCCTCCATGGGCCAACACCCCTGCTATACAGACTGGGCGCTGTTCAGAATGAAATC 
GGGTGGGAGGTACCCGGTTGTGGGGACGATATGTCTGACCCGCGACAAGTCTTACTTTAG 

A 

1150 NCOI, 

ThrLeuThrHisProValThrLysTyrIleMetThrCysMetSerAlaAspLeuGluVal
1202 ACCCTGACGCACCCAGTCACCAAATACATCATGACATGCATGTCGGCCGACCTGGAGGTC 
TGGGACTGCGTGGGTCAGTGGTTTATGTAGTACTGTACGTACAGCCGGCTGGACCTCCAG 

^ ^ ^ ^ ^

1230 BSPH1, 1234 DRD1, 1237 AVA3, 1245 EAG1 XMA3, 1250 DRD1,



ValThrSerThrTrpValLeuValGlyGlyValLeuAlaAlaLeuAlaAlaTyrCysLeu 
1262 GTCACGAGCACCTGGGTGCTCGTTGGCGGCGTCCTGGCTGCTTTGGCCGCGTATTGCCTG
CAGTGCTCGTGGACCCACGAGCAACCGCCGCAGGACCGACGAAACCGGCGCATAACGGAC 

SerThrGlyCysValValIleValGlyArgValValLeuSerGlyLysProAlaIleIle
1322 TCAACAGGCTGCGTGGTCATAGTGGGCAGGGTCGTCTTGTCCGGGAAGCCGGCAATCATA
AGTTGTCCGACGCACCAGTATCACCCGTCCCAGCAGAACAGGCCCTTCGGCCGTTAGTAT

A 

1369 NAEI, 

ProAspArgGluValLeuTyrArgGluPheAspGluMetGluGluCysSerGlnHisLeu 
1382 CCTGACAGGGAAGTCCTCTACCGAGAGTTCGATGAGATGGAAGAGTGCTCTCAGCACTTA
GGACTGTCCCTTCAGGAGATGGCTCTCAAGCTACTCTACCTTCTCACGAGAGTCGTGAAT 

A 

1385 DRDl, 

ProTyrIleGluGlnGlyMetMetLeuAlaGluGlnPheLysGlnLysAlaLeuGlyLeu
1442 CCGTACATCGAGCAAGGGATGATGCTCGCCGAGCAGTTCAAGCAGAAGGCCCTCGGCCTC 
GGCATGTAGCTCGTTCCCTACTACGAGCGGCTCGTCAAGTTCGTCTTCCGGGAGCCGGAG 

LeuGlnThrAlaSerArgGlnAlaGluValIleAlaProAlaValGlnThrAsnTrpGln
1502 CTGCAGACCGCGTCCCGTCAGGCAGAGGTTATCGCCCCTGCTGTCCAGACCAACTGGCAA
GACGTCTGGCGCAGGGCAGTCCGTCTCCAATAGCGGGGACGACAGGTCTGGTTGACCGTT 

A A 

1502 PSTI, 1507 TTH3I, 

LysLeuGluThrPheTrpAlaLysHisMetTrpAsnPheIleSerGlyIleGlnTyrLeu
1562 AAACTCGAGACCTTCTGGGCGAAGCATATGTGGAACTTCATCAGTGGGATACAATACTTG
TTTGAGCTCTGGAAGACCCGCTTCGTATACACCTTGAAGTAGTCACCCTATGTTATGAAC 

^ ^

1565 XHOI, 1586 NDEI, 

AlaGlyLeuSerThrLeuProGlyAsnProAlaIleAlaSerLeuMetAlaPheThrAla
1622 GCGGGCTTGTCAACGCTGCCTGGTAACCCCGCCATTGCTTCATTGATGGCTTTTACAGCT 
CGCCCGAACAGTTGCGACGGACCATTGGGGCGGTAACGAAGTAACTACCGAAAATGTCGA 

A A 

1643 BSTE2, 1677 ALWNl PVU2, 

AlaValThrSerProLeuThrThrSerGlnThrLeuLeuPheAsnIleLeuGlyGlyTrp
1682 GCTGTCACCAGCCCACTAACCACTAGCCAAACCCTCCTCTTCAACATATTGGGGGGGTGG 
CGACAGTGGTCGGGTGATTGGTGATCGGTTTGGGAGGAGAAGTTGTATAACCCCCCCACC 

ValAlaAlaGlnLeuAlaAlaProGlyAlaAlaThrAlaPheValGlyAlaGlyLeuAla
1742 GTGGCTGCCCAGCTCGCCGCCCCCGGTGCCGCTACTGCCTTTGTGGGCGCTGGCTTAGCT
CACCGACGGGTCGAGCGGCGGGGGCCACGGCGATGACGGAAACACCCGCGACCGAATCGA 

1794 ESPl, 

GlyAlaAlaIleGlySerValGlyLeuGlyLysValLeuIleAspIleLeuAlaGlyTyr
1802 GGCGCCGCCATCGGCAGTGTTGGACTGGGGAAGGTCCTCATAGACATCCTTGCAGGGTAT 
CCGCGGCGGTAGCCGTCACAACCTGACCCCTTCCAGGAGTATCTGTAGGAACGTCCCATA 

1802 KASl NARI, 

GlyAlaGlyValAlaGlyAlaLeuValAlaPheLysIleMetSerGlyGluValProSer
1862 GGCGCGGGCGTGGCGGGAGCTCTTGTGGCATTCAAGATCATGAGCGGTGAGGTCCCCTCC 
CCGCGCCCGCACCGCCCTCGAGAACACCGTAAGTTCTAGTACTCGCCACTCCAGGGGAGG 

1878 SACI, 1899 BSPHl, 

ThrGluAspLeuValAsnLeuLeuProAlaIleLeuSerProGlyAlaLeuValValGly
1922 ACGGAGGACCTGGTCAATCTACTGCCCGCCATCCTCTCGCCCGGAGCCCTCGTAGTCGGC 
TGCCTCCTGGACCAGTTAGATGACGGGCGGTAGGAGAGCGGGCCTCGGGAGCATCAGCCG 

1928 TTH3I, 

ValValCysAlaAlaIleLeuArgArgHisValGlyProGlyGluGlyAlaValGlnTrp
1982 GTGGTCTGTGCAGCAATACTGCGCCGGCACGTTGGCCCGGGCGAGGGGGCAGTGCAGTGG 
CACCAGACACGTCGTTATGACGCGGCCGTGCAACCGGGCCCGCTCCCCCGTCACGTCACC 

A A 

2004 NAEI, 2017 SMAI XMAI, 

MetAsnArgLeuIleAlaPheAlaSerArgGlyAsnHisValSerProThrHisTyrVal 
2042 ATGAACCGGCTGATAGCCTTCGCCTCCCGGGGGAACCATGTTTCCCCCACGCACTACGTG
TACTTGGCCGACTATCGGAAGCGGAGGGCCCCCTTGGTACAAAGGGGGTGCGTGATGCAC 

2067 SMAI XMAI, 2093 DRA3,

ProGluSerAspAlaAlaAlaArgValThrAlaIleLeuSerSerLeuThrValThrGln
2102 CCGGAGAGCGATGCAGCTGCCCGCGTCACTGCCATACTCAGCAGCCTCACTGTAACCCAG 
GGCCTCTCGCTACGTCGACGGGCGCAGTGACGGTATGAGTCGTCGGAGTGACATTGGGTC 

2115 PVU2, 2159 ALWNl, 

LeuLeuArgArgLeuHisGlnTrpIleSerSerGluCysThrThrProCysSerGlySer 
2162 CTCCTGAGGCGACTGCACCAGTGGATAAGCTCGGAGTGTACCACTCCATGCTCCGGTTCC 
GAGGACTCCGCTGACGTGGTCACCTATTCGAGCCTCACATGGTGAGGTACGAGGCCAAGG 

^ A 

2164 MST2, 2220 ECONl, 

TrpLeuArgAspIleTrpAspTrpIleCysGluValLeuSerAspPheLysThrTrpLeu 
2222 TGGCTAAGGGACATCTGGGACTGGATATGCGAGGTGTTGAGCGACTTTAAGACCTGGCTA 
ACCGATTCCCTGTAGACCCTGACCTATACGCTCCACAACTCGCTGAAATTCTGGACCGAT 

LysAlaLysLeuMetProGlnLeuProGlyIleProPheValSerCysGlnArgGlyTyr
2282 AAAGCTAAGCTCATGCCACAGCTGCCTGGGATCCCCTTTGTGTCCTGCCAGCGCGGGTAT 
TTTCGATTCGAGTACGGTGTCGACGGACCCTAGGGGAAACACAGGACGGTCGCGCCCATA 

A A A 

2285 ESPl, 2300 PVU2, 2310 BAMHI, 

LysGlyValTrpArgGlyAspGlyIleMetHisThrArgCysHisCysGlyAlaGluIle
2342 AAGGGGGTCTGGCGAGGGGACGGCATCATGCACACTCGCTGCCACTGTGGAGCTGAGATC
TTCCCCCAGACCGCTCCCCTGCCGTAGTACGTGTGAGCGACGGTGACACCTCGACTCTAG

ThrGlyHisValLysAsnGlyThrMetArgIleValGlyProArgThrCysArgAsnMet
2402 ACTGGACATGTCAAAAACGGGACGATGAGGATCGTCGGTCCTAGGACCTGCAGGAACATG
TGACCTGTACAGTTTTTGCCCTGCTACTCCTAGCAGCCAGGATCCTGGACGTCCTTGTAC

^ ^ ^ ^

2425 BSAB1, 2441 AVR2, 2448 SSE83871, 2449 PSTI,

TrpSerGlyThrPheProIleAsnAlaTyrThrThrGlyProCysThrProLeuProAla 
2462 TGGAGTGGGACCTTCCCCATTAATGCCTACACCACGGGCCCCTGTACCCCCCTTCCTGCG
ACCTCACCCTGGAAGGGGTAATTACGGATGTGGTGCCCGGGGACATGGGGGGAAGGACGC 

A A 

2480 ASEl, 2497 APAI, 

ProAsnTyrThrPheAlaLeuTrpArgValSerAlaGluGluTyrValGluIleArgGln 
2522 CCGAACTACACGTTCGCGCTATGGAGGGTGTCTGCAGAGGAATACGTGGAGATAAGGCAG
GGCTTGATGTGCAAGCGCGATACCTCCCACAGACGTCTCCTTATGCACCTCTATTCCGTC

A 

2553 PSTI, 

ValGlyAspPheHisTyrValThrGlyMetThrThrAspAsnLeuLysCysProCysGln 
2582 GTGGGGGACTTCCACTACGTGACGGGTATGACTACTGACAATCTTAAATGCCCGTGCCAG
CACCCCCTGAAGGTGATGCACTGCCCATACTGATGACTGTTAGAATTTACGGGCACGGTC 

A 

2594 DRA3, 

ValProSerProGluPhePheThrGluLeuAspGlyValArgLeuHisArgPheAlaPro 
2642 GTCCCATCGCCCGAATTTTTCACAGAATTGGACGGGGTGCGCCTACATAGGTTTGCGCCC
CAGGGTAGCGGGCTTAAAAAGTGTCTTAACCTGCCCCACGCGGATGTATCCAAACGCGGG 

ProCysLysProLeuLeuArgGluGluValSerPheArgValGlyLeuHisGluTyrPro 
2702 CCCTGCAAGCCCTTGCTGCGGGAGGAGGTATCATTCAGAGTAGGACTCCACGAATACCCG
GGGACGTTCGGGAACGACGCCCTCCTCCATAGTAAGTCTCATCCTGAGGTGCTTATGGGC 

A 

2757 HGIE2, 

ValGlySerGlnLeuProCysGluProGluProAspValAlaValLeuThrSerMetLeu
2762 GTAGGGTCGCAATTACCTTGCGAGCCCGAACCGGACGTGGCCGTGTTGACGTCCATGCTC 
CATCCCAGCGTTAATGGAACGCTCGGGCTTGGCCTGCACCGGCACAACTGCAGGTACGAG 

A 

2809 AAT2, 

ThrAspProSerHisIleThrAlaGluAlaAlaGlyArgArgLeuAlaArgGlySerPro
2822 ACTGATCCCTCCCATATAACAGCAGAGGCGGCCGGGCGAAGGTTGGCGAGGGGATCACCC
TGACTAGGGAGGGTATATTGTCGTCTCCGCCGGCCCGCTTCCAACCGCTCCCCTAGTGGG 

A 

2850 EAGl XMA3, 

ProSerValAlaSerSerSerAlaSerGlnLeuSerAlaProSerLeuLysAlaThrCys 
2882 CCCTCTGTGGCCAGCTCCTCGGCTAGCCAGCTATCCGCTCCATCTCTCAAGGCAACTTGC 
GGGAGACACCGGTCGAGGAGCCGATCGGTCGATAGGCGAGGTAGAGAGTTCCGTTGAACG 

A A 

2889 BALI, 2903 NHEI, 

ThrAlaAsnHisAspSerProAspAlaGluLeuIleGluAlaAsnLeuLeuTrpArgGln
2942 ACCGCTAACCATGACTCCCCTGATGCTGAGCTCATAGAGGCCAACCTCCTATGGAGGCAG 
TGGCGATTGGTACTGAGGGGACTACGACTCGAGTATCTCCGGTTGGAGGATACCTCCGTC 

2966 ESP1, 2969 SACI,

GluMetGlyGlyAsnIleThrArgValGluSerGluAsnLysValValIleLeuAspSer
3002 GAGATGGGCGGCAACATCACCAGGGTTGAGTCAGAAAACAAAGTGGTGATTCTGGACTCC 
CTCTACCCGCCGTTGTAGTGGTCCCAACTCAGTCTTTTGTTTCACCACTAAGACCTGAGG 

PheAspProLeuValAlaGluGluAspGluArgGluIleSerValProAlaGluIleLeu
3062 TTCGATCCGCTTGTGGCGGAGGAGGACGAGCGGGAGATCTCCGTACCCGCAGAAATCCTG 
AAGCTAGGCGAACACCGCCTCCTCCTGCTCGCCCTCTAGAGGCATGGGCGTCTTTAGGAC 

3096 BGL2, 

ArgLysSerArgArgPheAlaGlnAlaLeuProValTrpAlaArgProAspTyrAsnPro 
3122 CGGAAGTCTCGGAGATTCGCCCAGGCCCTGCCCGTTTGGGCGCGGCCGGACTATAACCCC 
GCCTTCAGAGCCTCTAAGCGGGTCCGGGACGGGCAAACCCGCGCCGGCCTGATATTGGGG 

3143 ALWNl, 3164 EAGl XMA3, 

ProLeuValGluThrTrpLysLysProAspTyrGluProProValValHisGlyCysPro 
3182 CCGCTAGTGGAGACGTGGAAAAAGCCCGACTACGAACCACCTGTGGTCCATGGCTGCCCG 
GGCGATCACCTCTGCACCTTTTTCGGGCTGATGCTTGGTGGACACCAGGTACCGACGGGC 

3217 HGIE2, 3229 NCOI, 

LeuProProProLysSerProProValProProProArgLysLysArgThrValValLeu 
3242 CTTCCACCTCCAAAGTCCCCTCCTGTGCCTCCGCCTCGGAAGAAGCGGACGGTGGTCCTC 
GAAGGTGGAGGTTTCAGGGGAGGACACGGAGGCGGAGCCTTCTTCGCCTGCCACCAGGAG 

ThrGluSerThrLeuSerThrAlaLeuAlaGluLeuAlaThrArgSerPheGlySerSer
3302 ACTGAATCAACCCTATCTACTGCCTTGGCCGAGCTCGCCACCAGAAGCTTTGGCAGCTCC
TGACTTAGTTGGGATAGATGACGGAACCGGCTCGAGCGGTGGTCTTCGAAACCGTCGAGG

^ ^

3332 SACI, 3346 HIND3, 

SerThrSerGlyIleThrGlyAspAsnThrThrThrSerSerGluProAlaProSerGly
3362 TCAACTTCCGGCATTACGGGCGACAATACGACAACATCCTCTGAGCCCGCCCCTTCTGGC 
AGTTGAAGGCCGTAATGCCCGCTGTTATGCTGTTGTAGGAGACTCGGGCGGGGAAGACCG 

CysProProAspSerAspAlaGluSerTyrSerSerMetProProLeuGluGlyGluPro 
3422 TGCCCCCCCGACTCCGACGCTGAGTCCTATTCCTCCATGCCCCCCCTGGAGGGGGAGCCT
ACGGGGGGGCTGAGGCTGCGACTCAGGATAAGGAGGTACGGGGGGGACCTCCCCCTCGGA 

A 

3437 EAM11051, 

GlyAspProAspLeuSerAspGlySerTrpSerThrValSerSerGluAlaAsnAlaGlu
3482 GGGGATCCGGATCTTAGCGACGGGTCATGGTCAACGGTCAGTAGTGAGGCCAACGCGGAG
CCCCTAGGCCTAGAATCGCTGCCCAGTACCAGTTGCCAGTCATCACTCCGGTTGCGCCTC

^ ^ ^

3484 BAMHI, 3485 BSABl, 3487 BSPEl, 

AspValValCysCysSerMetSerTyrSerTrpThrGlyAlaLeuValThrProCysAla 
3542 GATGTCGTGTGCTGCTCAATGTCTTACTCTTGGACAGGCGCACTCGTCACCCCGTGCGCC
CTACAGCACACGACGAGTTACAGAATGAGAACCTGTCCGCGTGAGCAGTGGGGCACGCGG 



FIGURE 17 - Page 7

3589 DRA3, 3600 SAC2,

AlaGluGluGlnLysLeuProIleAsnAlaLeuSerAsnSerLeuLeuArgHisHisAsn
3602 GCGGAAGAACAGAAACTGCCCATCAATGCACTAAGCAACTCGTTGCTACGTCACCACAAT
CGCCTTCTTGTCTTTGACGGGTAGTTACGTGATTCGTTGAGCAACGATGCAGTGGTGTTA 

3611 ALWNl, 3655 PFLMl, 

LeuValTyrSerThrThrSerArgSerAlaCysGlnArgGlnLysLysValThrPheAsp
3662 TTGGTGTATTCCACCACCTCACGCAGTGCTTGCCAAAGGCAGAAGAAAGTCACATTTGAC
AACCACATAAGGTGGTGGAGTGCGTCACGAACGGTTTCCGTCTTCTTTCAGTGTAAACTG

3681 DRA3, 

ArgLeuGlnValLeuAspSerHisTyrGlnAspValLeuLysGluValLysAlaAlaAla
3722 AGACTGCAAGTTCTGGACAGCCATTACCAGGACGTACTCAAGGAGGTTAAAGCAGCGGCG
TCTGACGTTCAAGACCTGTCGGTAATGGTCCTGCATGAGTTCCTCCAATTTCGTCGCCGC 

SerLysValLysAlaAsnLeuLeuSerValGluGluAlaCysSerLeuThrProProHis
3782 TCAAAAGTGAAGGCTAACTTGCTATCCGTAGAGGAAGCTTGCAGCCTGACGCCCCCACAC
AGTTTTCACTTCCGATTGAACGATAGGCATCTCCTTCGAACGTCGGACTGCGGGGGTGTG 

3816 HIND3, 

SerAlaLysSerLysPheGlyTyrGlyAlaLysAspValArgCysHisAlaArgLysAla 
3842 TCAGCCAAATCCAAGTTTGGTTATGGGGCAAAAGACGTCCGTTGCCATGCCAGAAAGGCC 
AGTCGGTTTAGGTTCAAACCAATACCCCGTTTTCTGCAGGCAACGGTACGGTCTTTCCGG 

3875 AAT2, 3890 BGLI, 

ValThrHisIleAsnSerValTrpLysAspLeuLeuGluAspAsnValThrProIleAsp 
3902 GTAACCCACATCAACTCCGTGTGGAAAGACCTTCTGGAAGACAATGTAACACCAATAGAC 
CATTGGGTGTAGTTGAGGCACACCTTTCTGGAAGACCTTCTGTTACATTGTGGTTATCTG 

ThrThrIleMetAlaLysAsnGluValPheCysValGlnProGluLysGlyGlyArgLys
3962 ACTACCATCATGGCTAAGAACGAGGTTTTCTGCGTTCAGCCTGAGAAGGGGGGTCGTAAG 
TGATGGTAGTACCGATTCTTGCTCCAAAAGACGCAAGTCGGACTCTTCCCCCCAGCATTC 

ProAlaArgLeuIleValPheProAspLeuGlyValArgValCysGluLysMetAlaLeu 
4022 CCAGCTCGTCTCATCGTGTTCCCCGATCTGGGCGTGCGCGTGTGCGAAAAGATGGCTTTG
GGTCGAGCAGAGTAGCACAAGGGGCTAGACCCGCACGCGCACACGCTTTTCTACCGAAAC 

TyrAspValValThrLysLeuProLeuAlaValMetGlySerSerTyrGlyPheGlnTyr 
4082 TACGACGTGGTTACAAAGCTCCCCTTGGCCGTGATGGGAAGCTCCTACGGATTCCAATAC
ATGCTGCACCAATGTTTCGAGGGGAACCGGCACTACCCTTCGAGGATGCCTAAGGTTATG 

SerProGlyGlnArgValGluPheLeuValGlnAlaTrpLysSerLysLysThrProMet 
4142 TCACCAGGACAGCGGGTTGAATTCCTCGTGCAAGCGTGGAAGTCCAAGAAAACCCCAATG 
AGTGGTCCTGTCGCCCAACTTAAGGAGCACGTTCGCACCTTCAGGTTCTTTTGGGGTTAC

4160 ECORI, 

GlyPheSerTyrAspThrArgCysPheAspSerThrValThrGluSerAspIleArgThr 
4202 GGGTTCTCGTATGATACCCGCTGCTTTGACTCCACAGTCACTGAGAGCGACATCCGTACG 
CCCAAGAGCATACTATGGGCGACGAAACTGAGGTGTCAGTGACTCTCGCTGTAGGCATGC 

4229 DRD1, 4236 ALWN1,

GluGluAlaIleTyrGlnCysCysAspLeuAspProGlnAlaArgValAlaIleLysSer
4262 GAGGAGGCAATCTACCAATGTTGTGACCTCGACCCCCAAGCCCGCGTGGCCATCAAGTCC 
CTCCTCCGTTAGATGGTTACAACACTGGAGCTGGGGGTTCGGGCGCACCGGTAGTTCAGG 

4301 BGLI, 4308 BALI, 

LeuThrGluArgLeuTyrValGlyGlyProLeuThrAsnSerArgGlyGluAsnCysGly 
4322 CTCACCGAGAGGCTTTATGTTGGGGGCCCTCTTACCAATTCAAGGGGGGAGAACTGCGGC 
GAGTGGCTCTCCGAAATACAACCCCCGGGAGAATGGTTAAGTTCCCCCCTCTTGACGCCG 

4345 APAI,

TyrArgArgCysArgAlaSerGlyValLeuThrThrSerCysGlyAsnThrLeuThrCys
4382 TATCGCAGGTGCCGCGCGAGCGGCGTACTGACAACTAGCTGTGGTAACACCCTCACTTGC
ATAGCGTCCACGGCGCGCTCGCCGCATGACTGTTGATCGACACCATTGTGGGAGTGAACG 

TyrIleLysAlaArgAlaAlaCysArgAlaAlaGlyLeuGlnAspCysThrMetLeuVal
4442 TACATCAAGGCCCGGGCAGCCTGTCGAGCCGCAGGGCTCCAGGACTGCACCATGCTCGTG
ATGTAGTTCCGGGCCCGTCGGACAGCTCGGCGTCCCGAGGTCCTGACGTGGTACGAGCAC

4452 SMAI XMAI,

CysGlyAspAspLeuValValIleCysGluSerAlaGlyValGlnGluAspAlaAlaSer
4502 TGTGGCGACGACTTAGTCGTTATCTGTGAAAGCGCGGGGGTCCAGGAGGACGCGGCGAGC
ACACCGCTGCTGAATCAGCAATAGACACTTTCGCGCCCCCAGGTCCTCCTGCGCCGCTCG 

4508 DRDl, 4511 TTH3I, 

LeuArgAlaPheThrGluAlaMetThrArgTyrSerAlaProProGlyAspProProGln 
4562 CTGAGAGCCTTCACGGAGGCTATGACCAGGTACTCCGCCCCCCCTGGGGACCCCCCACAA
GACTCTCGGAAGTGCCTCCGATACTGGTCCATGAGGCGGGGGGGACCCCTGGGGGGTGTT 

ProGluTyrAspLeuGluLeuIleThrSerCysSerSerAsnValSerValAlaHisAsp 
4622 CCAGAATACGACTTGGAGCTCATAACATCATGCTCCTCCAACGTGTCAGTCGCCCACGAC 
GGTCTTATGCTGAACCTCGAGTATTGTAGTACGAGGAGGTTGCACAGTCAGCGGGTGCTG 

4637 SACI,

GlyAlaGlyLysArgValTyrTyrLeuThrArgAspProThrThrProLeuAlaArgAla
4682 GGCGCTGGAAAGAGGGTCTACTACCTCACCCGTGACCCTACAACCCCCCTCGCGAGAGCT
CCGCGACCTTTCTCCCAGATGATGGAGTGGGCACTGGGATGTTGGGGGGAGCGCTCTCGA 

4731 NRUI, 

AlaTrpGluThrAlaArgHisThrProValAsnSerTrpLeuGlyAsnIleIleMetPhe
4742 GCGTGGGAGACAGCAAGACACACTCCAGTCAATTCCTGGCTAGGCAACATAATCATGTTT
CGCACCCTCTGTCGTTCTGTGTGAGGTCAGTTAAGGACCGATCCGTTGTATTAGTACAAA 

AlaProThrLeuTrpAlaArgMetIleLeuMetThrHisPhePheSerValLeuIleAla
4802 GCCCCCACACTGTGGGCGAGGATGATACTGATGACCCATTTCTTTAGCGTCCTTATAGCC 
CGGGGGTGTGACACCCGCTCCTACTATGACTACTGGGTAAAGAAATCGCAGGAATATCGG 

A A 

4806 PFLMl, 4807 DRA3, 

ArgAspGlnLeuGluGlnAlaLeuAspCysGluIleTyrGlyAlaCysTyrSerIleGlu
4862 AGGGACCAGCTTGAACAGGCCCTCGATTGCGAGATCTACGGGGCCTGCTACTCCATAGAA
TCCCTGGTCGAACTTGTCCGGGAGCTAACGCTCTAGATGCCCCGGACGATGAGGTATCTT 

A 

4893 BGL2,

ProLeuAspLeuProProIleIleGlnArgLeuHisGlyLeuSerAlaPheSerLeuHis
4922 CCACTGGATCTACCTCCAATCATTCAAAGACTCCATGGCCTCAGCGCATTTTCACTCCAC
GGTGACCTAGATGGAGGTTAGTAAGTTTCTGAGGTACCGGAGTCGCGTAAAAGTGAGGTG 

4954 NCOI,

SerTyrSerProGlyGluIleAsnArgValAlaAlaCysLeuArgLysLeuGlyValPro
4982 AGTTACTCTCCAGGTGAAATCAATAGGGTGGCCGCATGCCTCAGAAAACTTGGGGTACCG
TCAATGAGAGGTCCACTTTAGTTATCCCACCGGCGTACGGAGTCTTTTGAACCCCATGGC 

A A 

5015 SPHI, 5035 KPNI, 

ProLeuArgAlaTrpArgHisArgAlaArgSerValArgAlaArgLeuLeuAlaArgGly 
5042 CCCTTGCGAGCTTGGAGACACCGGGCCCGGAGCGTCCGCGCTAGGCTTCTGGCCAGAGGA 
GGGAACGCTCGAACCTCTGTGGCCCGGGCCTCGCAGGCGCGATCCGAAGACCGGTCTCCT 

A A 

5064 APAI, 5091 BALI, 

GlyArgAlaAlaIleCysGlyLysTyrLeuPheAsnTrpAlaValArgThrLysLeuLys
5102 GGCAGGGCTGCCATATGTGGCAAGTACCTCTTCAACTGGGCAGTAAGAACAAAGCTCAAA 
CCGTCCCGACGGTATACACCGTTCATGGAGAAGTTGACCCGTCATTCTTGTTTCGAGTTT 

A 

5113 NDEI,

LeuThrProIleAlaAlaAlaGlyGlnLeuAspLeuSerGlyTrpPheThrAlaGlyTyr 
5162 CTCACTCCAATAGCGGCCGCTGGCCAGCTGGACTTGTCCGGCTGGTTCACGGCTGGCTAC 
GAGTGAGGTTATCGCCGGCGACCGGTCGACCTGAACAGGCCGACCAAGTGCCGACCGATG 

^ ^ ^ ^

5174 NOTI, 5175 EAGl XMA3, 5182 BALI, 5186 PVU2, 

SerGlyGlyAspIleTyrHisSerValSerHisAlaArgProArgTrpIleTrpPheCys 
5222 AGCGGGGGAGACATTTATCACAGCGTGTCTCATGCCCGGCCCCGCTGGATCTGGTTTTGC 
TCGCCCCCTCTGTAAATAGTGTCGCACAGAGTACGGGCCGGGGCGACCTAGACCAAAACG 

5240 DRA3, 

LeuLeuLeuLeuAlaAlaGlyValGlyIleTyrLeuLeuProAsnArgMetSerThrAsn
5282 CTACTCCTGCTTGCTGCAGGGGTAGGCATCTACCTCCTCCCCAACCGAATGAGCACGAAT
GATGAGGACGAACGACGTCCCCATCCGTAGATGGAGGAGGGGTTGGCTTACTCGTGCTTA 

A 

5295 PSTI, 

ProLysProGlnArgLysThrLysArgAsnThrAsnArgArgProGlnAspValLysPhe
5342 CCTAAACCTCAAAGAAAGACCAAACGTAACACCAACCGGCGGCCGCAGGACGTCAAGTTC 
GGATTTGGAGTTTCTTTCTGGTTTGCATTGTGGTTGGCCGCCGGCGTCCTGCAGTTCAAG 

A A A A 

5380 NOTI, 5381 EAG1 XMA3, 5390 AAT2, 5401 SMAI XMAI,

ProGlyGlyGlyGlnIleValGlyGlyValTyrLeuLeuProArgArgGlyProArgLeu
5402 CCGGGTGGCGGTCAGATCGTTGGTGGAGTTTACTTGTTGCCGCGCAGGGGCCCTAGATTG 
GGCCCACCGCCAGTCTAGCAACCACCTCAAATGAACAACGGCGCGTCCCCGGGATCTAAC 

5449 APAI,

GlyValArgAlaThrArgLysThrSerGluArgSerGlnProArgGlyArgArgGlnPro 
5462 GGTGTGCGCGCGACGAGAAAGACTTCCGAGCGGTCGCAACCTCGAGGTAGACGTCAGCCT 
CCACACGCGCGCTGCTCTTTCTGAAGGCTCGCCAGCGTTGGAGCTCCATCTGCAGTCGGA

5467 BSSH2, 5478 XMNI, 5502 XHOI, 5511 AAT2, 

IleProLysAlaArgArgProGluGlyArgThrTrpAlaGlnProGlyTyrProTrpPro 
5522 ATCCCCAAGGCTCGTCGGCCCGAGGGCAGGACCTGGGCTCAGCCCGGGTACCCTTGGCCC 
TAGGGGTTCCGAGCAGCCGGGCTCCCGTCCTGGACCCGAGTCGGGCCCATGGGAACCGGG 

5548 ALWNl, 5558 ESPl, 5564 SMAI XMAI, 5568 KPNI, 

LeuTyrGlyAsnGluGlyCysGlyTrpAlaGlyTrpLeuLeuSerProArgGlySerArg 
5582 CTCTATGGCAATGAGGGCTGCGGGTGGGCGGGATGGCTCCTGTCTCCCCGTGGCTCTCGG
GAGATACCGTTACTCCCGACGCCCACCCGCCCTACCGAGGACAGAGGGGCACCGAGAGCC

ProSerTrpGlyProThrAspProArgArgArgSerArgAsnLeuGlyLysOC AM 
5642 CCTAGCTGGGGCCCCACAGACCCCCGGCGTAGGTCGCGCAATTTGGGTAAGTAATAGTCG 
GGATCGACCCCGGGGTGTCTGGGGGCCGCATCCAGCGCGTTAAACCCATTCATTATCAGC 

5650 APAI, 5698 SALI, 



5702 AC 
TG 

MetAlaAlaTyrAlaAlaGlnGlyTyrLysValLeuValLeuAsn
2 AGCTTACAAAACAAAATGGCTGCATATGCAGCTCAGGGCTATAAGGTGCTAGTACTCAAC
TCGAATGTTTTGTTTTACCGACGTATACGTCGAGTCCCGATATTCCACGATCATGAGTTG

1 HIND3, 24 NDEI, 52 SCAI,

ProSerValAlaAlaThrLeuGlyPheGlyAlaTyrMetSerLysAlaHisGlyIleAsp
62 CCCTCTGTTGCTGCAACACTGGGCTTTGGTGCTTACATGTCCAAGGCTCATGGGATCGAT 
GGGAGACAACGACGTTGTGACCCGAAACCACGAATGTACAGGTTCCGAGTACCCTAGCTA 

116 CLAI,

ProAsnIleArgThrGlyValArgThrIleThrThrGlySerProIleThrTyrSerThr
122 CCTAACATCAGGACCGGGGTGAGAACAATTACCACTGGCAGCCCCATCACGTACTCCACC
GGATTGTAGTCCTGGCCCCACTCTTGTTAATGGTGACCGTCGGGGTAGTGCATGAGGTGG 

TyrGlyLysPheLeuAlaAspGlyGlyCysSerGlyGlyAlaTyrAspIleIleIleCys
182 TACGGCAAGTTCCTTGCCGACGGCGGGTGCTCGGGGGGCGCTTATGACATAATAATTTGT 
ATGCCGTTCAAGGAACGGCTGCCGCCCACGAGCCCCCCGCGAATACTGTATTATTAAACA 

AspGluCysHisSerThrAspAlaThrSerIleLeuGlyIleGlyThrValLeuAspGln
242 GACGAGTGCCACTCCACGGATGCCACATCCATCTTGGGCATTGGCACTGTCCTTGACCAA
CTGCTCACGGTGAGGTGCCTACGGTGTAGGTAGAACCCGTAACCGTGACAGGAACTGGTT 

AlaGluThrAlaGlyAlaArgLeuValValLeuAlaThrAlaThrProProGlySerVal
302 GCAGAGACTGCGGGGGCGAGACTGGTTGTGCTCGCCACCGCCACCCCTCCGGGCTCCGTC 
CGTCTCTGACGCCCCCGCTCTGACCAACACGAGCGGTGGCGGTGGGGAGGCCCGAGGCAG 

303 ALWNl, 

ThrValProHisProAsnIleGluGluValAlaLeuSerThrThrGlyGluIleProPhe
362 ACTGTGCCCCATCCCAACATCGAGGAGGTTGCTCTGTCCACCACCGGAGAGATCCCTTTT 
TGACACGGGGTAGGGTTGTAGCTCCTCCAACGAGACAGGTGGTGGCCTCTCTAGGGAAAA 

TyrGlyLysAlaIleProLeuGluValIleLysGlyGlyArgHisLeuIlePheCysHis
422 TACGGCAAGGCTATCCCCCTCGAAGTAATCAAGGGGGGGAGACATCTCATCTTCTGTCAT 
ATGCCGTTCCGATAGGGGGAGCTTCATTAGTTCCCCCCCTCTGTAGAGTAGAAGACAGTA 

SerLysLysLysCysAspGluLeuAlaAlaLysLeuValAlaLeuGlyIleAsnAlaVal
482 TCAAAGAAGAAGTGCGACGAACTCGCCGCAAAGCTGGTCGCATTGGGCATCAATGCCGTG
AGTTTCTTCTTCACGCTGCTTGAGCGGCGTTTCGACCAGCGTAACCCGTAGTTACGGCAC 

AlaTyrTyrArgGlyLeuAspValSerVailleProThrSerGlyAspValValValVal 
542 GCCTACTACCGCGGTCTTGACGTGTCCGTCATCCCGACCAGCGGCGATGTTGTCGTCGTG 
CGGATGATGGCGCCAGAACTGCACAGGCAGTAGGGCTGGTCGCCGCTACAACAGCAGCAC

550 SAC2, 560 DRDI,

AlaThrAspAlaLeuMetThrGlyTyrThrGlyAspPheAspSerValIleAspCysAsn
602 GCAACCGATGCCCTCATGACCGGCTATACCGGCGACTTCGACTCGGTGATAGACTGCAAT 
CGTTGGCTACGGGAGTACTGGCCGATATGGCCGCTGAAGCTGAGCCACTATCTGACGTTA

ThrCysValThrGlnThrValAspPheSerLeuAspProThrPheThrIleGluThrIle
662 ACGTGTGTCACCCAGACAGTCGATTTCAGCCTTGACCCTACCTTCACCATTGAGACAATC
TGCACACAGTGGGTCTGTCAGCTAAAGTCGGAACTGGGATGGAAGTGGTAACTCTGTTAG 

ThrLeuProGlnAspAlaValSerArgThrGlnArgArgGlyArgThrGlyArgGlyLys
722 ACGCTCCCCCAAGATGCTGTCTCCCGCACTCAACGTCGGGGCAGGACTGGCAGGGGGAAG 
TGCGAGGGGGTTCTACGACAGAGGGCGTGAGTTGCAGCCCCGTCCTGACCGTCCCCCTTC 

ProGlyIleTyrArgPheValAlaProGlyGluArgProSerGlyMetPheAspSerSer
782 CCAGGCATCTACAGATTTGTGGCACCGGGGGAGCGCCCCTCCGGCATGTTCGACTCGTCC 
GGTCCGTAGATGTCTAAACACCGTGGCCCCCTCGCGGGGAGGCCGTACAAGCTGAGCAGG 

816 BGLI, 833 DRDI,

ValLeuCysGluCysTyrAspAlaGlyCysAlaTrpTyrGluLeuThrProAlaGluThr 
842 GTCCTCTGTGAGTGCTATGACGCAGGCTGTGCTTGGTATGAGCTCACGCCCGCCGAGACT 
CAGGAGACACTCACGATACTGCGTCCGACACGAACCATACTCGAGTGCGGGCGGCTCTGA 

881 SACI, 

ThrValArgLeuArgAlaTyrMetAsnThrProGlyLeuProValCysGlnAspHisLeu 
902 ACAGTTAGGCTACGAGCGTACATGAACACCCCGGGGCTTCCCGTGTGCCAGGACCATCTT
TGTCAATCCGATGCTCGCATGTACTTGTGGGGCCCCGAAGGGCACACGGTCCTGGTAGAA 

931 SMAI XMAI, 

GluPheTrpGluGlyValPheThrGlyLeuThrHisIleAspAlaHisPheLeuSerGln 
962 GAATTTTGGGAGGGCGTCTTTACAGGCCTCACTCATATAGATGCCCACTTTCTATCCCAG 
CTTAAAACCCTCCCGCAGAAATGTCCGGAGTGAGTATATCTACGGGTGAAAGATAGGGTC 

985 STUI, 

ThrLysGlnSerGlyGluAsnLeuProTyrLeuValAlaTyrGlnAlaThrValCysAla 
1022 ACAAAGCAGAGTGGGGAGAACCTTCCTTACCTGGTAGCGTACCAAGCCACCGTGTGCGCT 
TGTTTCGTCTCACCCCTCTTGGAAGGAATGGACCATCGCATGGTTCGGTGGCACACGCGA 

1069 DRA3, 

ArgAlaGlnAlaProProProSerTrpAspGlnMetTrpLysCysLeuIleArgLeuLys 
1082 AGGGCTCAAGCCCCTCCCCCATCGTGGGACCAGATGTGGAAGTGTTTGATTCGCCTCAAG 
TCCCGAGTTCGGGGAGGGGGTAGCACCCTGGTCTACACCTTCACAAACTAAGCGGAGTTC 

ProThrLeuHisGlyProThrProLeuLeuTyrArgLeuGlyAlaValGlnAsnGluIle
1142 CCCACCCTCCATGGGCCAACACCCCTGCTATACAGACTGGGCGCTGTTCAGAATGAAATC 
GGGTGGGAGGTACCCGGTTGTGGGGACGATATGTCTGACCCGCGACAAGTCTTACTTTAG 

1150 NCOI, 

ThrLeuThrHisProValThrLysTyrIleMetThrCysMetSerAlaAspLeuGluVal
1202 ACCCTGACGCACCCAGTCACCAAATACATCATGACATGCATGTCGGCCGACCTGGAGGTC 
TGGGACTGCGTGGGTCAGTGGTTTATGTAGTACTGTACGTACAGCCGGCTGGACCTCCAG 

1230 BSPHI, 1234 DRDI, 1237 AVA3, 1245 EAGI XMA3, 1250 DRDI,



ValThrSerThrTrpValLeuValGlyGlyValLeuAlaAlaLeuAlaAlaTyrCysLeu 
1262 GTCACGAGCACCTGGGTGCTCGTTGGCGGCGTCCTGGCTGCTTTGGCCGCGTATTGCCTG
CAGTGCTCGTGGACCCACGAGCAACCGCCGCAGGACCGACGAAACCGGCGCATAACGGAC

WO 01/38360 PCT/US00/32326

FIGURE 18 - Page 3

SerThrGlyCysValValIleValGlyArgValValLeuSerGlyLysProAlaIleIle
1322 TCAACAGGCTGCGTGGTCATAGTGGGCAGGGTCGTCTTGTCCGGGAAGCCGGCAATCATA 
AGTTGTCCGACGCACCAGTATCACCCGTCCCAGCAGAACAGGCCCTTCGGCCGTTAGTAT 

1369 NAEI, 

ProAspArgGluValLeuTyrArgGluPheAspGluMetGluGluCysSerGlnHisLeu 
1382 CCTGACAGGGAAGTCCTCTACCGAGAGTTCGATGAGATGGAAGAGTGCTCTCAGCACTTA 
GGACTGTCCCTTCAGGAGATGGCTCTCAAGCTACTCTACCTTCTCACGAGAGTCGTGAAT 

1385 DRDI,

ProTyrIleGluGlnGlyMetMetLeuAlaGluGlnPheLysGlnLysAlaLeuGlyLeu
1442 CCGTACATCGAGCAAGGGATGATGCTCGCCGAGCAGTTCAAGCAGAAGGCCCTCGGCCTC
GGCATGTAGCTCGTTCCCTACTACGAGCGGCTCGTCAAGTTCGTCTTCCGGGAGCCGGAG 

LeuGlnThrAlaSerArgGlnAlaGluValIleAlaProAlaValGlnThrAsnTrpGln
1502 CTGCAGACCGCGTCCCGTCAGGCAGAGGTTATCGCCCCTGCTGTCCAGACCAACTGGCAA 
GACGTCTGGCGCAGGGCAGTCCGTCTCCAATAGCGGGGACGACAGGTCTGGTTGACCGTT 

1502 PSTI, 1507 TTH3I, 

LysLeuGluThrPheTrpAlaLysHisMetTrpAsnPheIleSerGlyIleGlnTyrLeu
1562 AAACTCGAGACCTTCTGGGCGAAGCATATGTGGAACTTCATCAGTGGGATACAATACTTG 
TTTGAGCTCTGGAAGACCCGCTTCGTATACACCTTGAAGTAGTCACCCTATGTTATGAAC 

1565 XHOI, 1586 NDEI, 

AlaGlyLeuSerThrLeuProGlyAsnProAlaIleAlaSerLeuMetAlaPheThrAla
1622 GCGGGCTTGTCAACGCTGCCTGGTAACCCCGCCATTGCTTCATTGATGGCTTTTACAGCT 
CGCCCGAACAGTTGCGACGGACCATTGGGGCGGTAACGAAGTAACTACCGAAAATGTCGA 

1643 BSTE2, 1677 ALWNI PVU2,

AlaValThrSerProLeuThrThrSerGlnThrLeuLeuPheAsnIleLeuGlyGlyTrp
1682 GCTGTCACCAGCCCACTAACCACTAGCCAAACCCTCCTCTTCAACATATTGGGGGGGTGG 
CGACAGTGGTCGGGTGATTGGTGATCGGTTTGGGAGGAGAAGTTGTATAACCCCCCCACC 

ValAlaAlaGlnLeuAlaAlaProGlyAlaAlaThrAlaPheValGlyAlaGlyLeuAla 
1742 GTGGCTGCCCAGCTCGCCGCCCCCGGTGCCGCTACTGCCTTTGTGGGCGCTGGCTTAGCT 
CACCGACGGGTCGAGCGGCGGGGGCCACGGCGATGACGGAAACACCCGCGACCGAATCGA 

1794 ESPI,

GlyAlaAlaIleGlySerValGlyLeuGlyLysValLeuIleAspIleLeuAlaGlyTyr
1802 GGCGCCGCCATCGGCAGTGTTGGACTGGGGAAGGTCCTCATAGACATCCTTGCAGGGTAT
CCGCGGCGGTAGCCGTCACAACCTGACCCCTTCCAGGAGTATCTGTAGGAACGTCCCATA 

1802 KASI NARI,

GlyAlaGlyValAlaGlyAlaLeuValAlaPheLysIleMetSerGlyGluValProSer 
1862 GGCGCGGGCGTGGCGGGAGCTCTTGTGGCATTCAAGATCATGAGCGGTGAGGTCCCCTCC
CCGCGCCCGCACCGCCCTCGAGAACACCGTAAGTTCTAGTACTCGCCACTCCAGGGGAGG 

1878 SACI, 1899 BSPHI,



FIGURE 18 - Page 4



ThrGluAspLeuValAsnLeuLeuProAlaIleLeuSerProGlyAlaLeuValValGly
1922 ACGGAGGACCTGGTCAATCTACTGCCCGCCATCCTCTCGCCCGGAGCCCTCGTAGTCGGC 
TGCCTCCTGGACCAGTTAGATGACGGGCGGTAGGAGAGCGGGCCTCGGGAGCATCAGCCG 

1928 TTH3I,

ValValCysAlaAlaIleLeuArgArgHisValGlyProGlyGluGlyAlaValGlnTrp
1982 GTGGTCTGTGCAGCAATACTGCGCCGGCACGTTGGCCCGGGCGAGGGGGCAGTGCAGTGG 
CACCAGACACGTCGTTATGACGCGGCCGTGCAACCGGGCCCGCTCCCCCGTCACGTCACC 

2004 NAEI, 2017 SMAI XMAI, 

MetAsnArgLeuIleAlaPheAlaSerArgGlyAsnHisValSerProThrHisTyrVal 
2042 ATGAACCGGCTGATAGCCTTCGCCTCCCGGGGGAACCATGTTTCCCCCACGCACTACGTG
TACTTGGCCGACTATCGGAAGCGGAGGGCCCCCTTGGTACAAAGGGGGTGCGTGATGCAC 

2067 SMAI XMAI, 2093 DRA3, 

ProGluSerAspAlaAlaAlaArgValThrAlaIleLeuSerSerLeuThrValThrGln
2102 CCGGAGAGCGATGCAGCTGCCCGCGTCACTGCCATACTCAGCAGCCTCACTGTAACCCAG 
GGCCTCTCGCTACGTCGACGGGCGCAGTGACGGTATGAGTCGTCGGAGTGACATTGGGTC 

2115 PVU2, 2159 ALWNI,

LeuLeuArgArgLeuHisGlnTrpIleSerSerGluCysThrThrProCysSerGlySer 
2162 CTCCTGAGGCGACTGCACCAGTGGATAAGCTCGGAGTGTACCACTCCATGCTCCGGTTCC 
GAGGACTCCGCTGACGTGGTCACCTATTCGAGCCTCACATGGTGAGGTACGAGGCCAAGG 

2164 MST2, 2220 ECONI,

TrpLeuArgAspIleTrpAspTrpIleCysGluValLeuSerAspPheLysThrTrpLeu
2222 TGGCTAAGGGACATCTGGGACTGGATATGCGAGGTGTTGAGCGACTTTAAGACCTGGCTA 
ACCGATTCCCTGTAGACCCTGACCTATACGCTCCACAACTCGCTGAAATTCTGGACCGAT 

LysAlaLysLeuMetProGlnLeuProGlyIleProPheValSerCysGlnArgGlyTyr
2282 AAAGCTAAGCTCATGCCACAGCTGCCTGGGATCCCCTTTGTGTCCTGCCAGCGCGGGTAT 
TTTCGATTCGAGTACGGTGTCGACGGACCCTAGGGGAAACACAGGACGGTCGCGCCCATA 

2285 ESPI, 2300 PVU2, 2310 BAMHI,

LysGlyValTrpArgGlyAspGlyIleMetHisThrArgCysHisCysGlyAlaGluIle
2342 AAGGGGGTCTGGCGAGGGGACGGCATCATGCACACTCGCTGCCACTGTGGAGCTGAGATC
TTCCCCCAGACCGCTCCCCTGCCGTAGTACGTGTGAGCGACGGTGACACCTCGACTCTAG 

ThrGlyHisValLysAsnGlyThrMetArgIleValGlyProArgThrCysArgAsnMet
2402 ACTGGACATGTCAAAAACGGGACGATGAGGATCGTCGGTCCTAGGACCTGCAGGAACATG 
TGACCTGTACAGTTTTTGCCCTGCTACTCCTAGCAGCCAGGATCCTGGACGTCCTTGTAC

2425 BSABI, 2441 AVR2, 2448 SSE8387I, 2449 PSTI,

TrpSerGlyThrPheProIleAsnAlaTyrThrThrGlyProCysThrProLeuProAla 
2462 TGGAGTGGGACCTTCCCCATTAATGCCTACACCACGGGCCCCTGTACCCCCCTTCCTGCG
ACCTCACCCTGGAAGGGGTAATTACGGATGTGGTGCCCGGGGACATGGGGGGAAGGACGC 

2480 ASEI, 2497 APAI,

ProAsnTyrThrPheAlaLeuTrpArgValSerAlaGluGluTyrValGluIleArgGln
2522 CCGAACTACACGTTCGCGCTATGGAGGGTGTCTGCAGAGGAATACGTGGAGATAAGGCAG
GGCTTGATGTGCAAGCGCGATACCTCCCACAGACGTCTCCTTATGCACCTCTATTCCGTC 

2553 PSTI,

ValGlyAspPheHisTyrValThrGlyMetThrThrAspAsnLeuLysCysProCysGln
2582 GTGGGGGACTTCCACTACGTGACGGGTATGACTACTGACAATCTTAAATGCCCGTGCCAG
CACCCCCTGAAGGTGATGCACTGCCCATACTGATGACTGTTAGAATTTACGGGCACGGTC 

2594 DRA3, 

ValProSerProGluPhePheThrGluLeuAspGlyValArgLeuHisArgPheAlaPro
2642 GTCCCATCGCCCGAATTTTTCACAGAATTGGACGGGGTGCGCCTACATAGGTTTGCGCCC 
CAGGGTAGCGGGCTTAAAAAGTGTCTTAACCTGCCCCACGCGGATGTATCCAAACGCGGG 

ProCysLysProLeuLeuArgGluGluValSerPheArgValGlyLeuHisGluTyrPro
2702 CCCTGCAAGCCCTTGCTGCGGGAGGAGGTATCATTCAGAGTAGGACTCCACGAATACCCG 
GGGACGTTCGGGAACGACGCCCTCCTCCATAGTAAGTCTCATCCTGAGGTGCTTATGGGC

2757 HGIE2, 

ValGlySerGlnLeuProCysGluProGluProAspValAlaValLeuThrSerMetLeu 
2762 GTAGGGTCGCAATTACCTTGCGAGCCCGAACCGGACGTGGCCGTGTTGACGTCCATGCTC 
CATCCCAGCGTTAATGGAACGCTCGGGCTTGGCCTGCACCGGCACAACTGCAGGTACGAG 

2809 AAT2, 

ThrAspProSerHisIleThrAlaGluAlaAlaGlyArgArgLeuAlaArgGlySerPro 
2822 ACTGATCCCTCCCATATAACAGCAGAGGCGGCCGGGCGAAGGTTGGCGAGGGGATCACCC 
TGACTAGGGAGGGTATATTGTCGTCTCCGCCGGCCCGCTTCCAACCGCTCCCCTAGTGGG 

2850 EAGI XMA3,

ProSerValAlaSerSerSerAlaSerGlnLeuSerAlaProSerLeuLysAlaThrCys 
2882 CCCTCTGTGGCCAGCTCCTCGGCTAGCCAGCTATCCGCTCCATCTCTCAAGGCAACTTGC 
GGGAGACACCGGTCGAGGAGCCGATCGGTCGATAGGCGAGGTAGAGAGTTCCGTTGAACG 

2889 BALI, 2903 NHEI, 

ThrAlaAsnHisAspSerProAspAlaGluLeuIleGluAlaAsnLeuLeuTrpArgGln 
2942 ACCGCTAACCATGACTCCCCTGATGCTGAGCTCATAGAGGCCAACCTCCTATGGAGGCAG
TGGCGATTGGTACTGAGGGGACTACGACTCGAGTATCTCCGGTTGGAGGATACCTCCGTC 

2966 ESPI, 2969 SACI,

GluMetGlyGlyAsnIleThrArgValGluSerGluAsnLysValValIleLeuAspSer
3002 GAGATGGGCGGCAACATCACCAGGGTTGAGTCAGAAAACAAAGTGGTGATTCTGGACTCC 
CTCTACCCGCCGTTGTAGTGGTCCCAACTCAGTCTTTTGTTTCACCACTAAGACCTGAGG 

PheAspProLeuValAlaGluGluAspGluArgGluIleSerValProAlaGluIleLeu 
3062 TTCGATCCGCTTGTGGCGGAGGAGGACGAGCGGGAGATCTCCGTACCCGCAGAAATCCTG 
AAGCTAGGCGAACACCGCCTCCTCCTGCTCGCCCTCTAGAGGCATGGGCGTCTTTAGGAC 

3096 BGL2, 

ArgLysSerArgArgPheAlaGlnAlaLeuProValTrpAlaArgProAspTyrAsnPro
3122 CGGAAGTCTCGGAGATTCGCCCAGGCCCTGCCCGTTTGGGCGCGGCCGGACTATAACCCC
GCCTTCAGAGCCTCTAAGCGGGTCCGGGACGGGCAAACCCGCGCCGGCCTGATATTGGGG 

3143 ALWNI, 3164 EAGI XMA3,

ProLeuValGluThrTrpLysLysProAspTyrGluProProValValHisGlyCysPro 
3182 CCGCTAGTGGAGACGTGGAAAAAGCCCGACTACGAACCACCTGTGGTCCATGGCTGCCCG 
GGCGATCACCTCTGCACCTTTTTCGGGCTGATGCTTGGTGGACACCAGGTACCGACGGGC 

3217 HGIE2, 3229 NCOI, 

LeuProProProLysSerProProValProProProArgLysLysArgThrValValLeu 
3242 CTTCCACCTCCAAAGTCCCCTCCTGTGCCTCCGCCTCGGAAGAAGCGGACGGTGGTCCTC 
GAAGGTGGAGGTTTCAGGGGAGGACACGGAGGCGGAGCCTTCTTCGCCTGCCACCAGGAG 

ThrGluSerThrLeuSerThrAlaLeuAlaGluLeuAlaThrArgSerPheGlySerSer 
3302 ACTGAATCAACCCTATCTACTGCCTTGGCCGAGCTCGCCACCAGAAGCTTTGGCAGCTCC 
TGACTTAGTTGGGATAGATGACGGAACCGGCTCGAGCGGTGGTCTTCGAAACCGTCGAGG 

3332 SACI, 3346 HIND3, 

SerThrSerGlyIleThrGlyAspAsnThrThrThrSerSerGluProAlaProSerGly
3362 TCAACTTCCGGCATTACGGGCGACAATACGACAACATCCTCTGAGCCCGCCCCTTCTGGC 
AGTTGAAGGCCGTAATGCCCGCTGTTATGCTGTTGTAGGAGACTCGGGCGGGGAAGACCG 

CysProProAspSerAspAlaGluSerTyrSerSerMetProProLeuGluGlyGluPro 
3422 TGCCCCCCCGACTCCGACGCTGAGTCCTATTCCTCCATGCCCCCCCTGGAGGGGGAGCCT
ACGGGGGGGCTGAGGCTGCGACTCAGGATAAGGAGGTACGGGGGGGACCTCCCCCTCGGA 

3437 EAM1105I,

GlyAspProAspLeuSerAspGlySerTrpSerThrValSerSerGluAlaAsnAlaGlu 
3482 GGGGATCCGGATCTTAGCGACGGGTCATGGTCAACGGTCAGTAGTGAGGCCAACGCGGAG 
CCCCTAGGCCTAGAATCGCTGCCCAGTACCAGTTGCCAGTCATCACTCCGGTTGCGCCTC 

3484 BAMHI, 3485 BSABI, 3487 BSPEI,

AspValValCysCysSerMetSerTyrSerTrpThrGlyAlaLeuValThrProCysAla 
3542 GATGTCGTGTGCTGCTCAATGTCTTACTCTTGGACAGGCGCACTCGTCACCCCGTGCGCC
CTACAGCACACGACGAGTTACAGAATGAGAACCTGTCCGCGTGAGCAGTGGGGCACGCGG 

3589 DRA3, 3600 SAC2, 

AlaGluGluGlnLysLeuProIleAsnAlaLeuSerAsnSerLeuLeuArgHisHisAsn 
3602 GCGGAAGAACAGAAACTGCCCATCAATGCACTAAGCAACTCGTTGCTACGTCACCACAAT
CGCCTTCTTGTCTTTGACGGGTAGTTACGTGATTCGTTGAGCAACGATGCAGTGGTGTTA 

3611 ALWNI, 3655 PFLMI,

LeuValTyrSerThrThrSerArgSerAlaCysGlnArgGlnLysLysValThrPheAsp
3662 TTGGTGTATTCCACCACCTCACGCAGTGCTTGCCAAAGGCAGAAGAAAGTCACATTTGAC 
AACCACATAAGGTGGTGGAGTGCGTCACGAACGGTTTCCGTCTTCTTTCAGTGTAAACTG 

3681 DRA3,

ArgLeuGlnValLeuAspSerHisTyrGlnAspValLeuLysGluValLysAlaAlaAla 
3722 AGACTGCAAGTTCTGGACAGCCATTACCAGGACGTACTCAAGGAGGTTAAAGCAGCGGCG
TCTGACGTTCAAGACCTGTCGGTAATGGTCCTGCATGAGTTCCTCCAATTTCGTCGCCGC

SerLysValLysAlaAsnLeuLeuSerValGluGluAlaCysSerLeuThrProProHis
3782 TCAAAAGTGAAGGCTAACTTGCTATCCGTAGAGGAAGCTTGCAGCCTGACGCCCCCACAC 
AGTTTTCACTTCCGATTGAACGATAGGCATCTCCTTCGAACGTCGGACTGCGGGGGTGTG

3816 HIND3, 

SerAlaLysSerLysPheGlyTyrGlyAlaLysAspValArgCysHisAlaArgLysAla
3842 TCAGCCAAATCCAAGTTTGGTTATGGGGCAAAAGACGTCCGTTGCCATGCCAGAAAGGCC 
AGTCGGTTTAGGTTCAAACCAATACCCCGTTTTCTGCAGGCAACGGTACGGTCTTTCCGG 

3875 AAT2, 3890 BGLI,

ValThrHisIleAsnSerValTrpLysAspLeuLeuGluAspAsnValThrProIleAsp
3902 GTAACCCACATCAACTCCGTGTGGAAAGACCTTCTGGAAGACAATGTAACACCAATAGAC 
CATTGGGTGTAGTTGAGGCACACCTTTCTGGAAGACCTTCTGTTACATTGTGGTTATCTG 

ThrThrIleMetAlaLysAsnGluValPheCysValGlnProGluLysGlyGlyArgLys
3962 ACTACCATCATGGCTAAGAACGAGGTTTTCTGCGTTCAGCCTGAGAAGGGGGGTCGTAAG 
TGATGGTAGTACCGATTCTTGCTCCAAAAGACGCAAGTCGGACTCTTCCCCCCAGCATTC 

ProAlaArgLeuIleValPheProAspLeuGlyValArgValCysGluLysMetAlaLeu 
4022 CCAGCTCGTCTCATCGTGTTCCCCGATCTGGGCGTGCGCGTGTGCGAAAAGATGGCTTTG 
GGTCGAGCAGAGTAGCACAAGGGGCTAGACCCGCACGCGCACACGCTTTTCTACCGAAAC 

TyrAspValValThrLysLeuProLeuAlaValMetGlySerSerTyrGlyPheGlnTyr 
4082 TACGACGTGGTTACAAAGCTCCCCTTGGCCGTGATGGGAAGCTCCTACGGATTCCAATAC 
ATGCTGCACCAATGTTTCGAGGGGAACCGGCACTACCCTTCGAGGATGCCTAAGGTTATG 

SerProGlyGlnArgValGluPheLeuValGlnAlaTrpLysSerLysLysThrProMet 
4142 TCACCAGGACAGCGGGTTGAATTCCTCGTGCAAGCGTGGAAGTCCAAGAAAACCCCAATG
AGTGGTCCTGTCGCCCAACTTAAGGAGCACGTTCGCACCTTCAGGTTCTTTTGGGGTTAC 

4160 ECORI, 

GlyPheSerTyrAspThrArgCysPheAspSerThrValThrGluSerAspIleArgThr
4202 GGGTTCTCGTATGATACCCGCTGCTTTGACTCCACAGTCACTGAGAGCGACATCCGTACG
CCCAAGAGCATACTATGGGCGACGAAACTGAGGTGTCAGTGACTCTCGCTGTAGGCATGC

4229 DRDI, 4236 ALWNI,

GluGluAlaIleTyrGlnCysCysAspLeuAspProGlnAlaArgValAlaIleLysSer
4262 GAGGAGGCAATCTACCAATGTTGTGACCTCGACCCCCAAGCCCGCGTGGCCATCAAGTCC 
CTCCTCCGTTAGATGGTTACAACACTGGAGCTGGGGGTTCGGGCGCACCGGTAGTTCAGG 

4301 BGLI, 4308 BALI,

LeuThrGluArgLeuTyrValGlyGlyProLeuThrAsnSerArgGlyGluAsnCysGly 
4322 CTCACCGAGAGGCTTTATGTTGGGGGCCCTCTTACCAATTCAAGGGGGGAGAACTGCGGC
GAGTGGCTCTCCGAAATACAACCCCCGGGAGAATGGTTAAGTTCCCCCCTCTTGACGCCG 

4345 APAI, 

TyrArgArgCysArgAlaSerGlyValLeuThrThrSerCysGlyAsnThrLeuThrCys 
4382 TATCGCAGGTGCCGCGCGAGCGGCGTACTGACAACTAGCTGTGGTAACACCCTCACTTGC
ATAGCGTCCACGGCGCGCTCGCCGCATGACTGTTGATCGACACCATTGTGGGAGTGAACG

TyrIleLysAlaArgAlaAlaCysArgAlaAlaGlyLeuGlnAspCysThrMetLeuVal
4442 TACATCAAGGCCCGGGCAGCCTGTCGAGCCGCAGGGCTCCAGGACTGCACCATGCTCGTG
ATGTAGTTCCGGGCCCGTCGGACAGCTCGGCGTCCCGAGGTCCTGACGTGGTACGAGCAC 

4452 SMAI XMAI, 

CysGlyAspAspLeuValValIleCysGluSerAlaGlyValGlnGluAspAlaAlaSer
4502 TGTGGCGACGACTTAGTCGTTATCTGTGAAAGCGCGGGGGTCCAGGAGGACGCGGCGAGC 
ACACCGCTGCTGAATCAGCAATAGACACTTTCGCGCCCCCAGGTCCTCCTGCGCCGCTCG 

4508 DRDI, 4511 TTH3I,

LeuArgAlaPheThrGluAlaMetThrArgTyrSerAlaProProGlyAspProProGln 
4562 CTGAGAGCCTTCACGGAGGCTATGACCAGGTACTCCGCCCCCCCTGGGGACCCCCCACAA 
GACTCTCGGAAGTGCCTCCGATACTGGTCCATGAGGCGGGGGGGACCCCTGGGGGGTGTT 

ProGluTyrAspLeuGluLeuIleThrSerCysSerSerAsnValSerValAlaHisAsp 
4622 CCAGAATACGACTTGGAGCTCATAACATCATGCTCCTCCAACGTGTCAGTCGCCCACGAC
GGTCTTATGCTGAACCTCGAGTATTGTAGTACGAGGAGGTTGCACAGTCAGCGGGTGCTG 

4637 SACI,

GlyAlaGlyLysArgValTyrTyrLeuThrArgAspProThrThrProLeuAlaArgAla
4682 GGCGCTGGAAAGAGGGTCTACTACCTCACCCGTGACCCTACAACCCCCCTCGCGAGAGCT
CCGCGACCTTTCTCCCAGATGATGGAGTGGGCACTGGGATGTTGGGGGGAGCGCTCTCGA 

4731 NRUI,

AlaTrpGluThrAlaArgHisThrProValAsnSerTrpLeuGlyAsnIleIleMetPhe
4742 GCGTGGGAGACAGCAAGACACACTCCAGTCAATTCCTGGCTAGGCAACATAATCATGTTT
CGCACCCTCTGTCGTTCTGTGTGAGGTCAGTTAAGGACCGATCCGTTGTATTAGTACAAA 

AlaProThrLeuTrpAlaArgMetIleLeuMetThrHisPhePheSerValLeuIleAla
4802 GCCCCCACACTGTGGGCGAGGATGATACTGATGACCCATTTCTTTAGCGTCCTTATAGCC 
CGGGGGTGTGACACCCGCTCCTACTATGACTACTGGGTAAAGAAATCGCAGGAATATCGG 

4806 PFLMI, 4807 DRA3,

ArgAspGlnLeuGluGlnAlaLeuAspCysGluIleTyrGlyAlaCysTyrSerIleGlu
4862 AGGGACCAGCTTGAACAGGCCCTCGATTGCGAGATCTACGGGGCCTGCTACTCCATAGAA
TCCCTGGTCGAACTTGTCCGGGAGCTAACGCTCTAGATGCCCCGGACGATGAGGTATCTT 

4893 BGL2, 

ProLeuAspLeuProProIleIleGlnArgLeuHisGlyLeuSerAlaPheSerLeuHis
4922 CCACTGGATCTACCTCCAATCATTCAAAGACTCCATGGCCTCAGCGCATTTTCACTCCAC
GGTGACCTAGATGGAGGTTAGTAAGTTTCTGAGGTACCGGAGTCGCGTAAAAGTGAGGTG 

4954 NCOI,

SerTyrSerProGlyGluIleAsnArgValAlaAlaCysLeuArgLysLeuGlyValPro
4982 AGTTACTCTCCAGGTGAAATCAATAGGGTGGCCGCATGCCTCAGAAAACTTGGGGTACCG
TCAATGAGAGGTCCACTTTAGTTATCCCACCGGCGTACGGAGTCTTTTGAACCCCATGGC 

5015 SPHI, 5035 KPNI,

ProLeuArgAlaTrpArgHisArgAlaArgSerValArgAlaArgLeuLeuAlaArgGly
5042 CCCTTGCGAGCTTGGAGACACCGGGCCCGGAGCGTCCGCGCTAGGCTTCTGGCCAGAGGA
GGGAACGCTCGAACCTCTGTGGCCCGGGCCTCGCAGGCGCGATCCGAAGACCGGTCTCCT 

5064 APAI, 5091 BALI, 

GlyArgAlaAlaIleCysGlyLysTyrLeuPheAsnTrpAlaValArgThrLysLeuLys
5102 GGCAGGGCTGCCATATGTGGCAAGTACCTCTTCAACTGGGCAGTAAGAACAAAGCTCAAA 
CCGTCCCGACGGTATACACCGTTCATGGAGAAGTTGACCCGTCATTCTTGTTTCGAGTTT 

5113 NDEI, 

LeuThrProIleAlaAlaAlaGlyGlnLeuAspLeuSerGlyTrpPheThrAlaGlyTyr 
5162 CTCACTCCAATAGCGGCCGCTGGCCAGCTGGACTTGTCCGGCTGGTTCACGGCTGGCTAC 
GAGTGAGGTTATCGCCGGCGACCGGTCGACCTGAACAGGCCGACCAAGTGCCGACCGATG 

5174 NOTI, 5175 EAGI XMA3, 5182 BALI, 5186 PVU2,

SerGlyGlyAspIleTyrHisSerValSerHisAlaArgProArgTrpIleTrpPheCys 
5222 AGCGGGGGAGACATTTATCACAGCGTGTCTCATGCCCGGCCCCGCTGGATCTGGTTTTGC 
TCGCCCCCTCTGTAAATAGTGTCGCACAGAGTACGGGCCGGGGCGACCTAGACCAAAACG 

5240 DRA3, 

LeuLeuLeuLeuAlaAlaGlyValGlyIleTyrLeuLeuProAsnArgMetSerThrAsn
5282 CTACTCCTGCTTGCTGCAGGGGTAGGCATCTACCTCCTCCCCAACCGAATGAGCACGAAT 
GATGAGGACGAACGACGTCCCCATCCGTAGATGGAGGAGGGGTTGGCTTACTCGTGCTTA 

5295 PSTI, 

ProLysProGlnArgLysThrLysArgAsnThrAsnArgArgProGlnAspValLysPhe 
5342 CCTAAACCTCAAAGAAAGACCAAACGTAACACCAACCGGCGGCCGCAGGACGTCAAGTTC
GGATTTGGAGTTTCTTTCTGGTTTGCATTGTGGTTGGCCGCCGGCGTCCTGCAGTTCAAG 

5380 NOTI, 5381 EAGI XMA3, 5390 AAT2, 5401 SMAI XMAI,

ProGlyGlyGlyGlnIleValGlyGlyValTyrLeuLeuProArgArgGlyProArgLeu
5402 CCGGGTGGCGGTCAGATCGTTGGTGGAGTTTACTTGTTGCCGCGCAGGGGCCCTAGATTG 
GGCCCACCGCCAGTCTAGCAACCACCTCAAATGAACAACGGCGCGTCCCCGGGATCTAAC 

5449 APAI, 

GlyValArgAlaThrArgLysThrSerGluArgSerGlnProArgGlyArgArgGlnPro 
5462 GGTGTGCGCGCGACGAGAAAGACTTCCGAGCGGTCGCAACCTCGAGGTAGACGTCAGCCT
CCACACGCGCGCTGCTCTTTCTGAAGGCTCGCCAGCGTTGGAGCTCCATCTGCAGTCGGA 

5467 BSSH2, 5478 XMNI, 5502 XHOI, 5511 AAT2, 

IleProLysAlaArgArgProGluGlyArgThrTrpAlaGlnProGlyTyrProTrpPro
5522 ATCCCCAAGGCTCGTCGGCCCGAGGGCAGGACCTGGGCTCAGCCCGGGTACCCTTGGCCC 
TAGGGGTTCCGAGCAGCCGGGCTCCCGTCCTGGACCCGAGTCGGGCCCATGGGAACCGGG 

5548 ALWNI, 5558 ESPI, 5564 SMAI XMAI, 5568 KPNI,

LeuTyrGlyAsnGluGlyCysGlyTrpAlaGlyTrpLeuLeuSerProArgGlySerArg 
5582 CTCTATGGCAATGAGGGCTGCGGGTGGGCGGGATGGCTCCTGTCTCCCCGTGGCTCTCGG 
GAGATACCGTTACTCCCGACGCCCACCCGCCCTACCGAGGACAGAGGGGCACCGAGAGCC

ProSerTrpGlyProThrAspProArgArgArgSerArgAsnLeuGlyLysValIleAsp
5642 CCTAGCTGGGGCCCCACAGACCCCCGGCGTAGGTCGCGCAATTTGGGTAAGGTCATCGAT
GGATCGACCCCGGGGTGTCTGGGGGCCGCATCCAGCGCGTTAAACCCATTCCAGTAGCTA

5650 APAI, 5696 CLAI,

ThrLeuThrCysGlyPheAlaAspLeuMetGlyTyrIleProLeuValGlyAlaProLeu
5702 ACCCTTACGTGCGGCTTCGCCGACCTCATGGGGTACATACCGCTCGTCGGCGCCCCTCTT
TGGGAATGCACGCCGAAGCGGCTGGAGTACCCCATGTATGGCGAGCAGCCGCGGGGAGAA 

5724 HGIE2, 5750 KASI NARI, 5756 ECONI,

GlyGlyAlaAlaArgAlaLeuAlaHisGlyValArgValLeuGluAspGlyValAsnTyr
5762 GGAGGCGCTGCCAGGGCCCTGGCGCATGGCGTCCGGGTTCTGGAAGACGGCGTGAACTAT 
CCTCCGCGACGGTCCCGGGACCGCGTACCGCAGGCCCAAGACCTTCTGCCGCACTTGATA 

5772 BSTXI, 5775 APAI, 

AlaThrGlyAsnLeuProGlyCysSerOC AM 
5822 GCAACAGGGAACCTTCCTGGTTGCTCTTAATAGTCGAC
CGTTGTCCCTTGGAAGGACCAACGAGAATTATCAGCTG 

5854 SALI,

MetAlaAlaTyrAlaAlaGlnGlyTyrLysValLeuValLeuAsn
2 AGCTTACAAAACAAAATGGCTGCATATGCAGCTCAGGGCTATAAGGTGCTAGTACTCAAC 
TCGAATGTTTTGTTTTACCGACGTATACGTCGAGTCCCGATATTCCACGATCATGAGTTG 

1 HIND3, 24 NDEI, 52 SCAI, 

ProSerValAlaAlaThrLeuGlyPheGlyAlaTyrMetSerLysAlaHisGlyIleAsp
62 CCCTCTGTTGCTGCAACACTGGGCTTTGGTGCTTACATGTCCAAGGCTCATGGGATCGAT 
GGGAGACAACGACGTTGTGACCCGAAACCACGAATGTACAGGTTCCGAGTACCCTAGCTA

116 CLAI, 

ProAsnIleArgThrGlyValArgThrIleThrThrGlySerProIleThrTyrSerThr
122 CCTAACATCAGGACCGGGGTGAGAACAATTACCACTGGCAGCCCCATCACGTACTCCACC 
GGATTGTAGTCCTGGCCCCACTCTTGTTAATGGTGACCGTCGGGGTAGTGCATGAGGTGG 

TyrGlyLysPheLeuAlaAspGlyGlyCysSerGlyGlyAlaTyrAspIleIleIleCys
182 TACGGCAAGTTCCTTGCCGACGGCGGGTGCTCGGGGGGCGCTTATGACATAATAATTTGT
ATGCCGTTCAAGGAACGGCTGCCGCCCACGAGCCCCCCGCGAATACTGTATTATTAAACA 

AspGluCysHisSerThrAspAlaThrSerIleLeuGlyIleGlyThrValLeuAspGln
242 GACGAGTGCCACTCCACGGATGCCACATCCATCTTGGGCATTGGCACTGTCCTTGACCAA
CTGCTCACGGTGAGGTGCCTACGGTGTAGGTAGAACCCGTAACCGTGACAGGAACTGGTT 

AlaGluThrAlaGlyAlaArgLeuValValLeuAlaThrAlaThrProProGlySerVal
302 GCAGAGACTGCGGGGGCGAGACTGGTTGTGCTCGCCACCGCCACCCCTCCGGGCTCCGTC 
CGTCTCTGACGCCCCCGCTCTGACCAACACGAGCGGTGGCGGTGGGGAGGCCCGAGGCAG 

303 ALWNI,

ThrValProHisProAsnIleGluGluValAlaLeuSerThrThrGlyGluIleProPhe
362 ACTGTGCCCCATCCCAACATCGAGGAGGTTGCTCTGTCCACCACCGGAGAGATCCCTTTT
TGACACGGGGTAGGGTTGTAGCTCCTCCAACGAGACAGGTGGTGGCCTCTCTAGGGAAAA 

TyrGlyLysAlaIleProLeuGluValIleLysGlyGlyArgHisLeuIlePheCysHis
422 TACGGCAAGGCTATCCCCCTCGAAGTAATCAAGGGGGGGAGACATCTCATCTTCTGTCAT 
ATGCCGTTCCGATAGGGGGAGCTTCATTAGTTCCCCCCCTCTGTAGAGTAGAAGACAGTA

SerLysLysLysCysAspGluLeuAlaAlaLysLeuValAlaLeuGlyIleAsnAlaVal
482 TCAAAGAAGAAGTGCGACGAACTCGCCGCAAAGCTGGTCGCATTGGGCATCAATGCCGTG
AGTTTCTTCTTCACGCTGCTTGAGCGGCGTTTCGACCAGCGTAACCCGTAGTTACGGCAC

AlaTyrTyrArgGlyLeuAspValSerValIleProThrSerGlyAspValValValVal
542 GCCTACTACCGCGGTCTTGACGTGTCCGTCATCCCGACCAGCGGCGATGTTGTCGTCGTG
CGGATGATGGCGCCAGAACTGCACAGGCAGTAGGGCTGGTCGCCGCTACAACAGCAGCAC 

550 SAC2, 560 DRDI,

AlaThrAspAlaLeuMetThrGlyTyrThrGlyAspPheAspSerValIleAspCysAsn
602 GCAACCGATGCCCTCATGACCGGCTATACCGGCGACTTCGACTCGGTGATAGACTGCAAT 
CGTTGGCTACGGGAGTACTGGCCGATATGGCCGCTGAAGCTGAGCCACTATCTGACGTTA

615 BSPHI,

ThrCysValThrGlnThrValAspPheSerLeuAspProThrPheThrIleGluThrIle
662 ACGTGTGTCACCCAGACAGTCGATTTCAGCCTTGACCCTACCTTCACCATTGAGACAATC
TGCACACAGTGGGTCTGTCAGCTAAAGTCGGAACTGGGATGGAAGTGGTAACTCTGTTAG 

ThrLeuProGlnAspAlaValSerArgThrGlnArgArgGlyArgThrGlyArgGlyLys
722 ACGCTCCCCCAAGATGCTGTCTCCCGCACTCAACGTCGGGGCAGGACTGGCAGGGGGAAG 
TGCGAGGGGGTTCTACGACAGAGGGCGTGAGTTGCAGCCCCGTCCTGACCGTCCCCCTTC 

ProGlyIleTyrArgPheValAlaProGlyGluArgProSerGlyMetPheAspSerSer
782 CCAGGCATCTACAGATTTGTGGCACCGGGGGAGCGCCCCTCCGGCATGTTCGACTCGTCC 
GGTCCGTAGATGTCTAAACACCGTGGCCCCCTCGCGGGGAGGCCGTACAAGCTGAGCAGG 

816 BGLI, 833 DRDI,

ValLeuCysGluCysTyrAspAlaGlyCysAlaTrpTyrGluLeuThrProAlaGluThr
842 GTCCTCTGTGAGTGCTATGACGCAGGCTGTGCTTGGTATGAGCTCACGCCCGCCGAGACT
CAGGAGACACTCACGATACTGCGTCCGACACGAACCATACTCGAGTGCGGGCGGCTCTGA

881 SACI, 

ThrValArgLeuArgAlaTyrMetAsnThrProGlyLeuProValCysGlnAspHisLeu 
902 ACAGTTAGGCTACGAGCGTACATGAACACCCCGGGGCTTCCCGTGTGCCAGGACCATCTT 
TGTCAATCCGATGCTCGCATGTACTTGTGGGGCCCCGAAGGGCACACGGTCCTGGTAGAA 

931 SMAI XMAI, 

GluPheTrpGluGlyValPheThrGlyLeuThrHisIleAspAlaHisPheLeuSerGln 
962 GAATTTTGGGAGGGCGTCTTTACAGGCCTCACTCATATAGATGCCCACTTTCTATCCCAG 
CTTAAAACCCTCCCGCAGAAATGTCCGGAGTGAGTATATCTACGGGTGAAAGATAGGGTC 

985 STUI,

ThrLysGlnSerGlyGluAsnLeuProTyrLeuValAlaTyrGlnAlaThrValCysAla
1022 ACAAAGCAGAGTGGGGAGAACCTTCCTTACCTGGTAGCGTACCAAGCCACCGTGTGCGCT 
TGTTTCGTCTCACCCCTCTTGGAAGGAATGGACCATCGCATGGTTCGGTGGCACACGCGA 

1069 DRA3, 

ArgAlaGlnAlaProProProSerTrpAspGlnMetTrpLysCysLeuIleArgLeuLys
1082 AGGGCTCAAGCCCCTCCCCCATCGTGGGACCAGATGTGGAAGTGTTTGATTCGCCTCAAG
TCCCGAGTTCGGGGAGGGGGTAGCACCCTGGTCTACACCTTCACAAACTAAGCGGAGTTC

ProThrLeuHisGlyProThrProLeuLeuTyrArgLeuGlyAlaValGlnAsnGluIle 
1142 CCCACCCTCCATGGGCCAACACCCCTGCTATACAGACTGGGCGCTGTTCAGAATGAAATC
GGGTGGGAGGTACCCGGTTGTGGGGACGATATGTCTGACCCGCGACAAGTCTTACTTTAG

1150 NCOI, 

ThrLeuThrHisProValThrLysTyrIleMetThrCysMetSerAlaAspLeuGluVal
1202 ACCCTGACGCACCCAGTCACCAAATACATCATGACATGCATGTCGGCCGACCTGGAGGTC 
TGGGACTGCGTGGGTCAGTGGTTTATGTAGTACTGTACGTACAGCCGGCTGGACCTCCAG 

1230 BSPHI, 1234 DRDI, 1237 AVA3, 1245 EAGI XMA3, 1250 DRDI,



ValThrSerThrTrpValLeuValGlyGlyValLeuAlaAlaLeuAlaAlaTyrCysLeu 
1262 GTCACGAGCACCTGGGTGCTCGTTGGCGGCGTCCTGGCTGCTTTGGCCGCGTATTGCCTG 
CAGTGCTCGTGGACCCACGAGCAACCGCCGCAGGACCGACGAAACCGGCGCATAACGGAC

SerThrGlyCysValValIleValGlyArgValValLeuSerGlyLysProAlaIleIle
1322 TCAACAGGCTGCGTGGTCATAGTGGGCAGGGTCGTCTTGTCCGGGAAGCCGGCAATCATA
AGTTGTCCGACGCACCAGTATCACCCGTCCCAGCAGAACAGGCCCTTCGGCCGTTAGTAT 

1369 NAEI, 

ProAspArgGluValLeuTyrArgGluPheAspGluMetGluGluCysSerGlnHisLeu 
1382 CCTGACAGGGAAGTCCTCTACCGAGAGTTCGATGAGATGGAAGAGTGCTCTCAGCACTTA 
GGACTGTCCCTTCAGGAGATGGCTCTCAAGCTACTCTACCTTCTCACGAGAGTCGTGAAT 

1385 DRDI,

ProTyrIleGluGlnGlyMetMetLeuAlaGluGlnPheLysGlnLysAlaLeuGlyLeu
1442 CCGTACATCGAGCAAGGGATGATGCTCGCCGAGCAGTTCAAGCAGAAGGCCCTCGGCCTC 
GGCATGTAGCTCGTTCCCTACTACGAGCGGCTCGTCAAGTTCGTCTTCCGGGAGCCGGAG 

LeuGlnThrAlaSerArgGlnAlaGluValIleAlaProAlaValGlnThrAsnTrpGln
1502 CTGCAGACCGCGTCCCGTCAGGCAGAGGTTATCGCCCCTGCTGTCCAGACCAACTGGCAA 
GACGTCTGGCGCAGGGCAGTCCGTCTCCAATAGCGGGGACGACAGGTCTGGTTGACCGTT 

1502 PSTI, 1507 TTH3I, 

LysLeuGluThrPheTrpAlaLysHisMetTrpAsnPheIleSerGlyIleGlnTyrLeu
1562 AAACTCGAGACCTTCTGGGCGAAGCATATGTGGAACTTCATCAGTGGGATACAATACTTG
TTTGAGCTCTGGAAGACCCGCTTCGTATACACCTTGAAGTAGTCACCCTATGTTATGAAC 

1565 XHOI, 1586 NDEI, 

AlaGlyLeuSerThrLeuProGlyAsnProAlaIleAlaSerLeuMetAlaPheThrAla
1622 GCGGGCTTGTCAACGCTGCCTGGTAACCCCGCCATTGCTTCATTGATGGCTTTTACAGCT 
CGCCCGAACAGTTGCGACGGACCATTGGGGCGGTAACGAAGTAACTACCGAAAATGTCGA 

1643 BSTE2, 1677 ALWNI PVU2,

AlaValThrSerProLeuThrThrSerGlnThrLeuLeuPheAsnIleLeuGlyGlyTrp
1682 GCTGTCACCAGCCCACTAACCACTAGCCAAACCCTCCTCTTCAACATATTGGGGGGGTGG 
CGACAGTGGTCGGGTGATTGGTGATCGGTTTGGGAGGAGAAGTTGTATAACCCCCCCACC

ValAlaAlaGlnLeuAlaAlaProGlyAlaAlaThrAlaPheValGlyAlaGlyLeuAla
1742 GTGGCTGCCCAGCTCGCCGCCCCCGGTGCCGCTACTGCCTTTGTGGGCGCTGGCTTAGCT 
CACCGACGGGTCGAGCGGCGGGGGCCACGGCGATGACGGAAACACCCGCGACCGAATCGA 

1794 ESPI,

GlyAlaAlaIleGlySerValGlyLeuGlyLysValLeuIleAspIleLeuAlaGlyTyr
1802 GGCGCCGCCATCGGCAGTGTTGGACTGGGGAAGGTCCTCATAGACATCCTTGCAGGGTAT 
CCGCGGCGGTAGCCGTCACAACCTGACCCCTTCCAGGAGTATCTGTAGGAACGTCCCATA 

1802 KASI NARI,

GlyAlaGlyValAlaGlyAlaLeuValAlaPheLysIleMetSerGlyGluValProSer
1862 GGCGCGGGCGTGGCGGGAGCTCTTGTGGCATTCAAGATCATGAGCGGTGAGGTCCCCTCC 
CCGCGCCCGCACCGCCCTCGAGAACACCGTAAGTTCTAGTACTCGCCACTCCAGGGGAGG 

1878 SACI, 1899 BSPHI,

ThrGluAspLeuValAsnLeuLeuProAlaIleLeuSerProGlyAlaLeuValValGly
1922 ACGGAGGACCTGGTCAATCTACTGCCCGCCATCCTCTCGCCCGGAGCCCTCGTAGTCGGC 
TGCCTCCTGGACCAGTTAGATGACGGGCGGTAGGAGAGCGGGCCTCGGGAGCATCAGCCG 

1928 TTH3I, 

ValValCysAlaAlaIleLeuArgArgHisValGlyProGlyGluGlyAlaValGlnTrp
1982 GTGGTCTGTGCAGCAATACTGCGCCGGCACGTTGGCCCGGGCGAGGGGGCAGTGCAGTGG 
CACCAGACACGTCGTTATGACGCGGCCGTGCAACCGGGCCCGCTCCCCCGTCACGTCACC 

2004 NAEI, 2017 SMAI XMAI, 

MetAsnArgLeuIleAlaPheAlaSerArgGlyAsnHisValSerProThrHisTyrVal 
2042 ATGAACCGGCTGATAGCCTTCGCCTCCCGGGGGAACCATGTTTCCCCCACGCACTACGTG
TACTTGGCCGACTATCGGAAGCGGAGGGCCCCCTTGGTACAAAGGGGGTGCGTGATGCAC 

2067 SMAI XMAI, 2093 DRA3, 

ProGluSerAspAlaAlaAlaArgValThrAlaIleLeuSerSerLeuThrValThrGln
2102 CCGGAGAGCGATGCAGCTGCCCGCGTCACTGCCATACTCAGCAGCCTCACTGTAACCCAG 
GGCCTCTCGCTACGTCGACGGGCGCAGTGACGGTATGAGTCGTCGGAGTGACATTGGGTC 

2115 PVU2, 2159 ALWNI,

LeuLeuArgArgLeuHisGlnTrpIleSerSerGluCysThrThrProCysSerGlySer
2162 CTCCTGAGGCGACTGCACCAGTGGATAAGCTCGGAGTGTACCACTCCATGCTCCGGTTCC 
GAGGACTCCGCTGACGTGGTCACCTATTCGAGCCTCACATGGTGAGGTACGAGGCCAAGG 

2164 MST2, 2220 ECONI,

TrpLeuArgAspIleTrpAspTrpIleCysGluValLeuSerAspPheLysThrTrpLeu 
2222 TGGCTAAGGGACATCTGGGACTGGATATGCGAGGTGTTGAGCGACTTTAAGACCTGGCTA 
ACCGATTCCCTGTAGACCCTGACCTATACGCTCCACAACTCGCTGAAATTCTGGACCGAT 

LysAlaLysLeuMetProGlnLeuProGlyIleProPheValSerCysGlnArgGlyTyr
2282 AAAGCTAAGCTCATGCCACAGCTGCCTGGGATCCCCTTTGTGTCCTGCCAGCGCGGGTAT
TTTCGATTCGAGTACGGTGTCGACGGACCCTAGGGGAAACACAGGACGGTCGCGCCCATA 

2285 ESPI, 2300 PVU2, 2310 BAMHI,

LysGlyValTrpArgGlyAspGlyIleMetHisThrArgCysHisCysGlyAlaGluIle
2342 AAGGGGGTCTGGCGAGGGGACGGCATCATGCACACTCGCTGCCACTGTGGAGCTGAGATC 
TTCCCCCAGACCGCTCCCCTGCCGTAGTACGTGTGAGCGACGGTGACACCTCGACTCTAG 

ThrGlyHisValLysAsnGlyThrMetArgIleValGlyProArgThrCysArgAsnMet
2402 ACTGGACATGTCAAAAACGGGACGATGAGGATCGTCGGTCCTAGGACCTGCAGGAACATG
TGACCTGTACAGTTTTTGCCCTGCTACTCCTAGCAGCCAGGATCCTGGACGTCCTTGTAC 

2425 BSABI, 2441 AVR2, 2448 SSE8387I, 2449 PSTI,

TrpSerGlyThrPheProIleAsnAlaTyrThrThrGlyProCysThrProLeuProAla 
2462 TGGAGTGGGACCTTCCCCATTAATGCCTACACCACGGGCCCCTGTACCCCCCTTCCTGCG 
ACCTCACCCTGGAAGGGGTAATTACGGATGTGGTGCCCGGGGACATGGGGGGAAGGACGC 

2480 ASEI, 2497 APAI,

ProAsnTyrThrPheAlaLeuTrpArgValSerAlaGluGluTyrValGluIleArgGln 
2522 CCGAACTACACGTTCGCGCTATGGAGGGTGTCTGCAGAGGAATACGTGGAGATAAGGCAG 
GGCTTGATGTGCAAGCGCGATACCTCCCACAGACGTCTCCTTATGCACCTCTATTCCGTC 

2553 PSTI, 

ValGlyAspPheHisTyrValThrGlyMetThrThrAspAsnLeuLysCysProCysGln 
2582 GTGGGGGACTTCCACTACGTGACGGGTATGACTACTGACAATCTTAAATGCCCGTGCCAG 
CACCCCCTGAAGGTGATGCACTGCCCATACTGATGACTGTTAGAATTTACGGGCACGGTC 

2594 DRA3,

ValProSerProGluPhePheThrGluLeuAspGlyValArgLeuHisArgPheAlaPro 
2642 GTCCCATCGCCCGAATTTTTCACAGAATTGGACGGGGTGCGCCTACATAGGTTTGCGCCC
CAGGGTAGCGGGCTTAAAAAGTGTCTTAACCTGCCCCACGCGGATGTATCCAAACGCGGG 

ProCysLysProLeuLeuArgGluGluValSerPheArgValGlyLeuHisGluTyrPro
2702 CCCTGCAAGCCCTTGCTGCGGGAGGAGGTATCATTCAGAGTAGGACTCCACGAATACCCG
GGGACGTTCGGGAACGACGCCCTCCTCCATAGTAAGTCTCATCCTGAGGTGCTTATGGGC 

2757 HGIE2, 

ValGlySerGlnLeuProCysGluProGluProAspValAlaValLeuThrSerMetLeu 
2762 GTAGGGTCGCAATTACCTTGCGAGCCCGAACCGGACGTGGCCGTGTTGACGTCCATGCTC 
CATCCCAGCGTTAATGGAACGCTCGGGCTTGGCCTGCACCGGCACAACTGCAGGTACGAG 

2809 AAT2, 

ThrAspProSerHisIleThrAlaGluAlaAlaGlyArgArgLeuAlaArgGlySerPro
2822 ACTGATCCCTCCCATATAACAGCAGAGGCGGCCGGGCGAAGGTTGGCGAGGGGATCACCC
TGACTAGGGAGGGTATATTGTCGTCTCCGCCGGCCCGCTTCCAACCGCTCCCCTAGTGGG

2850 EAGI XMA3,

ProSerValAlaSerSerSerAlaSerGlnLeuSerAlaProSerLeuLysAlaThrCys 
2882 CCCTCTGTGGCCAGCTCCTCGGCTAGCCAGCTATCCGCTCCATCTCTCAAGGCAACTTGC 
GGGAGACACCGGTCGAGGAGCCGATCGGTCGATAGGCGAGGTAGAGAGTTCCGTTGAACG 

2889 BALI, 2903 NHEI,

ThrAlaAsnHisAspSerProAspAlaGluLeuIleGluAlaAsnLeuLeuTrpArgGln
2942 ACCGCTAACCATGACTCCCCTGATGCTGAGCTCATAGAGGCCAACCTCCTATGGAGGCAG
TGGCGATTGGTACTGAGGGGACTACGACTCGAGTATCTCCGGTTGGAGGATACCTCCGTC

2966 ESPI, 2969 SACI,

GluMetGlyGlyAsnIleThrArgValGluSerGluAsnLysValValIleLeuAspSer
3002 GAGATGGGCGGCAACATCACCAGGGTTGAGTCAGAAAACAAAGTGGTGATTCTGGACTCC 
CTCTACCCGCCGTTGTAGTGGTCCCAACTCAGTCTTTTGTTTCACCACTAAGACCTGAGG 

PheAspProLeuValAlaGluGluAspGluArgGluIleSerValProAlaGluIleLeu 
3062 TTCGATCCGCTTGTGGCGGAGGAGGACGAGCGGGAGATCTCCGTACCCGCAGAAATCCTG
AAGCTAGGCGAACACCGCCTCCTCCTGCTCGCCCTCTAGAGGCATGGGCGTCTTTAGGAC 

3096 BGL2, 

ArgLysSerArgArgPheAlaGlnAlaLeuProValTrpAlaArgProAspTyrAsnPro 
3122 CGGAAGTCTCGGAGATTCGCCCAGGCCCTGCCCGTTTGGGCGCGGCCGGACTATAACCCC 
GCCTTCAGAGCCTCTAAGCGGGTCCGGGACGGGCAAACCCGCGCCGGCCTGATATTGGGG 

3143 ALWNI, 3164 EAGI XMA3,

ProLeuValGluThrTrpLysLysProAspTyrGluProProValValHisGlyCysPro 
3182 CCGCTAGTGGAGACGTGGAAAAAGCCCGACTACGAACCACCTGTGGTCCATGGCTGCCCG 
GGCGATCACCTCTGCACCTTTTTCGGGCTGATGCTTGGTGGACACCAGGTACCGACGGGC 

3217 HGIE2, 3229 NCOI, 

LeuProProProLysSerProProValProProProArgLysLysArgThrValValLeu 
3242 CTTCCACCTCCAAAGTCCCCTCCTGTGCCTCCGCCTCGGAAGAAGCGGACGGTGGTCCTC 
GAAGGTGGAGGTTTCAGGGGAGGACACGGAGGCGGAGCCTTCTTCGCCTGCCACCAGGAG 

ThrGluSerThrLeuSerThrAlaLeuAlaGluLeuAlaThrArgSerPheGlySerSer 
3302 ACTGAATCAACCCTATCTACTGCCTTGGCCGAGCTCGCCACCAGAAGCTTTGGCAGCTCC 
TGACTTAGTTGGGATAGATGACGGAACCGGCTCGAGCGGTGGTCTTCGAAACCGTCGAGG 

3332 SACI, 3346 HIND3, 

SerThrSerGlyIleThrGlyAspAsnThrThrThrSerSerGluProAlaProSerGly
3362 TCAACTTCCGGCATTACGGGCGACAATACGACAACATCCTCTGAGCCCGCCCCTTCTGGC 
AGTTGAAGGCCGTAATGCCCGCTGTTATGCTGTTGTAGGAGACTCGGGCGGGGAAGACCG 

CysProProAspSerAspAlaGluSerTyrSerSerMetProProLeuGluGlyGluPro 
3422 TGCCCCCCCGACTCCGACGCTGAGTCCTATTCCTCCATGCCCCCCCTGGAGGGGGAGCCT 
ACGGGGGGGCTGAGGCTGCGACTCAGGATAAGGAGGTACGGGGGGGACCTCCCCCTCGGA 

3437 EAM1105I,

GlyAspProAspLeuSerAspGlySerTrpSerThrValSerSerGluAlaAsnAlaGlu 
3482 GGGGATCCGGATCTTAGCGACGGGTCATGGTCAACGGTCAGTAGTGAGGCCAACGCGGAG 
CCCCTAGGCCTAGAATCGCTGCCCAGTACCAGTTGCCAGTCATCACTCCGGTTGCGCCTC 

3484 BAMHI, 3485 BSABI, 3487 BSPEI,

AspValValCysCysSerMetSerTyrSerTrpThrGlyAlaLeuValThrProCysAla 
3542 GATGTCGTGTGCTGCTCAATGTCTTACTCTTGGACAGGCGCACTCGTCACCCCGTGCGCC 
CTACAGCACACGACGAGTTACAGAATGAGAACCTGTCCGCGTGAGCAGTGGGGCACGCGG

3589 DRA3, 3600 SAC2,

AlaGIuGluGlnLysLeuProIleAsnAlaLeuSerAsnSerLeuLeuArgHisHisAsn 
3602.. GCGGAAGAACAGAAACTGCCCATCAATGCACTAAGCAACTCGTTGCTACGTCACCACAAT 
CGCCTTCTTGTCTTTGACGGGTAGTTACGTGATTCGTTGAGCAACGATGCAGTGGTGTTA 

3611 ALWNl, 3655 PFLMl, 

LeuValTyrSerThrThrSerArgSerAlaCysGlnArgGlnLysLysValThrPheAsp 
3662 TTGGTGTATTCCACCACCTCACGCAGTGCTTGCCAAAGGCAGAAGAAAGTCACATTTGAC 
AACCACATAAGGTGGTGGAGTGCGTCACGAACGGTTTCCGTCTTCTTTCAGTGTAAACTG 

A 

3681 DRA3, 

ArgLeuGlnValLeuAspSerHisTyrGlnAspValLeuLysGluValLysAlaAlaAla 
3722 AGACTGCAAGTTCTGGACAGCCATTACCAGGACGTACTCAAGGAGGTTAAAGCAGCGGCG 
TCTGACGTTCAAGACCTGTCGGTAATGGTCCTGCATGAGTTCCTCCAATTTCGTCGCCGC 

SerLysValLysAlaAsnLeuLeuSerValGluGluAlaCysSerLeuThrProProHis 
3782 TCAAAAGTGAAGGCTAACTTGCTATCCGTAGAGGAAGCTTGCAGCCTGACGCCCCCACAC 
AGTTTTCACTTCCGATTGAACGATAGGCATCTCCTTCGAACGTCGGACTGCGGGGGTGTG 

A 

3816 HIND3, 

SerAlaLysSerLysPheGlyTyrGlyAlaLysAspValArgCysHisAlaArgLysAla 
3842 TCAGCCAAATCCAAGTTTGGTTATGGGGCAAAAGACGTCCGTTGCCATGCCAGAAAGGCC
AGTCGGTTTAGGTTCAAACCAATACCCCGTTTTCTGCAGGCAACGGTACGGTCTTTCCGG 

A A 

3875 AAT2, 3890 BGLI, 

ValThrHisIleAsnSerValTrpLysAspLeuLeuGluAspAsnValThrProIleAsp 
3902 GTAACCCACATCAACTCCGTGTGGAAAGACCTTCTGGAAGACAATGTAACACCAATAGAC
CATTGGGTGTAGTTGAGGCACACCTTTCTGGAAGACCTTCTGTTACATTGTGGTTATCTG 

ThrThrIleMetAlaLysAsnGluValPheCysValGlnProGluLysGlyGlyArgLys
3962 ACTACCATCATGGCTAAGAACGAGGTTTTCTGCGTTCAGCCTGAGAAGGGGGGTCGTAAG
TGATGGTAGTACCGATTCTTGCTCCAAAAGACGCAAGTCGGACTCTTCCCCCCAGCATTC 

ProAlaArgLeuIleValPheProAspLeuGlyValArgValCysGluLysMetAlaLeu 
4022 CCAGCTCGTCTCATCGTGTTCCCCGATCTGGGCGTGCGCGTGTGCGAAAAGATGGCTTTG 
GGTCGAGCAGAGTAGCACAAGGGGCTAGACCCGCACGCGCACACGCTTTTCTACCGAAAC 

TyrAspValValThrLysLeuProLeuAlaValMetGlySerSerTyrGlyPheGlnTyr 
4082 TACGACGTGGTTACAAAGCTCCCCTTGGCCGTGATGGGAAGCTCCTACGGATTCCAATAC
ATGCTGCACCAATGTTTCGAGGGGAACCGGCACTACCCTTCGAGGATGCCTAAGGTTATG

SerProGlyGlnArgValGluPheLeuValGlnAlaTrpLysSerLysLysThrProMet 
4142 TCACCAGGACAGCGGGTTGAATTCCTCGTGCAAGCGTGGAAGTCCAAGAAAACCCCAATG 
AGTGGTCCTGTCGCCCAACTTAAGGAGCACGTTCGCACCTTCAGGTTCTTTTGGGGTTAC 

A 

4160 ECORI, 

GlyPheSerTyrAspThrArgCysPheAspSerThrValThrGluSerAspIleArgThr 
4202 GGGTTCTCGTATGATACCCGCTGCTTTGACTCCACAGTCACTGAGAGCGACATCCGTACG 
CCCAAGAGCATACTATGGGCGACGAAACTGAGGTGTCAGTGACTCTCGCTGTAGGCATGC

4229 DRDl, 4236 ALWNl,

GluGluAlaIleTyrGlnCysCysAspLeuAspProGlnAlaArgValAlaIleLysSer
4262 GAGGAGGCAATCTACCAATGTTGTGACCTCGACCCCCAAGCCCGCGTGGCCATCAAGTCC
CTCCTCCGTTAGATGGTTACAACACTGGAGCTGGGGGTTCGGGCGCACCGGTAGTTCAGG

A A 

4301 BGLI, 4308 BALI, 

LeuThrGluArgLeuTyrValGlyGlyProLeuThrAsnSerArgGlyGluAsnCysGly
4322 CTCACCGAGAGGCTTTATGTTGGGGGCCCTCTTACCAATTCAAGGGGGGAGAACTGCGGC
GAGTGGCTCTCCGAAATACAACCCCCGGGAGAATGGTTAAGTTCCCCCCTCTTGACGCCG 

A 

4345 APAI,

TyrArgArgCysArgAlaSerGlyValLeuThrThrSerCysGlyAsnThrLeuThrCys
4382 TATCGCAGGTGCCGCGCGAGCGGCGTACTGACAACTAGCTGTGGTAACACCCTCACTTGC
ATAGCGTCCACGGCGCGCTCGCCGCATGACTGTTGATCGACACCATTGTGGGAGTGAACG 

TyrIleLysAlaArgAlaAlaCysArgAlaAlaGlyLeuGlnAspCysThrMetLeuVal
4442 TACATCAAGGCCCGGGCAGCCTGTCGAGCCGCAGGGCTCCAGGACTGCACCATGCTCGTG
ATGTAGTTCCGGGCCCGTCGGACAGCTCGGCGTCCCGAGGTCCTGACGTGGTACGAGCAC 

A 

4452 SMAI XMAI, 

CysGlyAspAspLeuValValIleCysGluSerAlaGlyValGlnGluAspAlaAlaSer
4502 TGTGGCGACGACTTAGTCGTTATCTGTGAAAGCGCGGGGGTCCAGGAGGACGCGGCGAGC
ACACCGCTGCTGAATCAGCAATAGACACTTTCGCGCCCCCAGGTCCTCCTGCGCCGCTCG 

A A 

4508 DRDl, 4511 TTH3I, 

LeuArgAlaPheThrGluAlaMetThrArgTyrSerAlaProProGlyAspProProGln
4562 CTGAGAGCCTTCACGGAGGCTATGACCAGGTACTCCGCCCCCCCTGGGGACCCCCCACAA 
GACTCTCGGAAGTGCCTCCGATACTGGTCCATGAGGCGGGGGGGACCCCTGGGGGGTGTT 

ProGluTyrAspLeuGluLeuIleThrSerCysSerSerAsnValSerValAlaHisAsp 
4622 CCAGAATACGACTTGGAGCTCATAACATCATGCTCCTCCAACGTGTCAGTCGCCCACGAC 
GGTCTTATGCTGAACCTCGAGTATTGTAGTACGAGGAGGTTGCACAGTCAGCGGGTGCTG 

A 

4637 SACI,

GlyAlaGlyLysArgValTyrTyrLeuThrArgAspProThrThrProLeuAlaArgAla 
4682 GGCGCTGGAAAGAGGGTCTACTACCTCACCCGTGACCCTACAACCCCCCTCGCGAGAGCT 
CCGCGACCTTTCTCCCAGATGATGGAGTGGGCACTGGGATGTTGGGGGGAGCGCTCTCGA 

4731 NRUI, 

AlaTrpGluThrAlaArgHisThrProValAsnSerTrpLeuGlyAsnIleIleMetPhe
4742 GCGTGGGAGACAGCAAGACACACTCCAGTCAATTCCTGGCTAGGCAACATAATCATGTTT
CGCACCCTCTGTCGTTCTGTGTGAGGTCAGTTAAGGACCGATCCGTTGTATTAGTACAAA 

AlaProThrLeuTrpAlaArgMetIleLeuMetThrHisPhePheSerValLeuIleAla
4802 GCCCCCACACTGTGGGCGAGGATGATACTGATGACCCATTTCTTTAGCGTCCTTATAGCC
CGGGGGTGTGACACCCGCTCCTACTATGACTACTGGGTAAAGAAATCGCAGGAATATCGG 

A A 

4806 PFLMl, 4807 DRA3, 

ArgAspGlnLeuGluGlnAlaLeuAspCysGluIleTyrGlyAlaCysTyrSerIleGlu
4862 AGGGACCAGCTTGAACAGGCCCTCGATTGCGAGATCTACGGGGCCTGCTACTCCATAGAA
TCCCTGGTCGAACTTGTCCGGGAGCTAACGCTCTAGATGCCCCGGACGATGAGGTATCTT

4893 BGL2,

ProLeuAspLeuProProIleIleGlnArgLeuHisGlyLeuSerAlaPheSerLeuHis
4922 CCACTGGATCTACCTCCAATCATTCAAAGACTCCATGGCCTCAGCGCATTTTCACTCCAC
GGTGACCTAGATGGAGGTTAGTAAGTTTCTGAGGTACCGGAGTCGCGTAAAAGTGAGGTG 

4954 NCOI, 

SerTyrSerProGlyGluIleAsnArgValAlaAlaCysLeuArgLysLeuGlyValPro 
4982 AGTTACTCTCCAGGTGAAATCAATAGGGTGGCCGCATGCCTCAGAAAACTTGGGGTACCG
TCAATGAGAGGTCCACTTTAGTTATCCCACCGGCGTACGGAGTCTTTTGAACCCCATGGC 

5015 SPHI, 5035 KPNI, 

ProLeuArgAlaTrpArgHisArgAlaArgSerValArgAlaArgLeuLeuAlaArgGly
5042 CCCTTGCGAGCTTGGAGACACCGGGCCCGGAGCGTCCGCGCTAGGCTTCTGGCCAGAGGA 
GGGAACGCTCGAACCTCTGTGGCCCGGGCCTCGCAGGCGCGATCCGAAGACCGGTCTCCT 

A A 

5064 APAI, 5091 BALI, 

GlyArgAlaAlaIleCysGlyLysTyrLeuPheAsnTrpAlaValArgThrLysLeuLys
5102 GGCAGGGCTGCCATATGTGGCAAGTACCTCTTCAACTGGGCAGTAAGAACAAAGCTCAAA 
CCGTCCCGACGGTATACACCGTTCATGGAGAAGTTGACCCGTCATTCTTGTTTCGAGTTT 

A 

5113 NDEI, 

LeuThrProIleAlaAlaAlaGlyGlnLeuAspLeuSerGlyTrpPheThrAlaGlyTyr 
5162 CTCACTCCAATAGCGGCCGCTGGCCAGCTGGACTTGTCCGGCTGGTTCACGGCTGGCTAC 
GAGTGAGGTTATCGCCGGCGACCGGTCGACCTGAACAGGCCGACCAAGTGCCGACCGATG

^ A A A 

5174 NOTI, 5175 EAGl XMA3, 5182 BALI, 5186 PVU2, 

SerGlyGlyAspIleTyrHisSerValSerHisAlaArgProArgTrpIleTrpPheCys
5222 AGCGGGGGAGACATTTATCACAGCGTGTCTCATGCCCGGCCCCGCTGGATCTGGTTTTGC
TCGCCCCCTCTGTAAATAGTGTCGCACAGAGTACGGGCCGGGGCGACCTAGACCAAAACG 

A 

5240 DRA3, 

LeuLeuLeuLeuAlaAlaGlyValGlyIleTyrLeuLeuProAsnArgMetSerThrAsn
5282 CTACTCCTGCTTGCTGCAGGGGTAGGCATCTACCTCCTCCCCAACCGAATGAGCACGAAT
GATGAGGACGAACGACGTCCCCATCCGTAGATGGAGGAGGGGTTGGCTTACTCGTGCTTA 

A 

5295 PSTI, 

ProLysProGlnArgLysThrLysArgAsnThrAsnArgArgProGlnAspValLysPhe 
5342 CCTAAACCTCAAAGAAAGACCAAACGTAACACCAACCGGCGGCCGCAGGACGTCAAGTTC
GGATTTGGAGTTTCTTTCTGGTTTGCATTGTGGTTGGCCGCCGGCGTCCTGCAGTTCAAG 

A A A A 

5380 NOTI, 5381 EAGl XMA3, 5390 AAT2, 5401 SMAI XMAI,

ProGlyGlyGlyGlnIleValGlyGlyValTyrLeuLeuProArgArgGlyProArgLeu
5402 CCGGGTGGCGGTCAGATCGTTGGTGGAGTTTACTTGTTGCCGCGCAGGGGCCCTAGATTG 
GGCCCACCGCCAGTCTAGCAACCACCTCAAATGAACAACGGCGCGTCCCCGGGATCTAAC

5449 APAI,

GlyValArgAlaThrArgLysThrSerGluArgSerGlnProArgGlyArgArgGlnPro
5462 GGTGTGCGCGCGACGAGAAAGACTTCCGAGCGGTCGCAACCTCGAGGTAGACGTCAGCCT
CCACACGCGCGCTGCTCTTTCTGAAGGCTCGCCAGCGTTGGAGCTCCATCTGCAGTCGGA 

5467 BSSH2, 5478 XMNI, 5502 XHOI, 5511 AAT2, 

IleProLysAlaArgArgProGluGlyArgThrTrpAlaGlnProGlyTyrProTrpPro
5522 ATCCCCAAGGCTCGTCGGCCCGAGGGCAGGACCTGGGCTCAGCCCGGGTACCCTTGGCCC 
TAGGGGTTCCGAGCAGCCGGGCTCCCGTCCTGGACCCGAGTCGGGCCCATGGGAACCGGG 

5548 ALWNI, 5558 ESPl, 5564 SMAI XMAI, 5568 KPNI, 

LeuTyrGlyAsnGluGlyCysGlyTrpAlaGlyTrpLeuLeuSerProArgGlySerArg 
5582 CTCTATGGCAATGAGGGCTGCGGGTGGGCGGGATGGCTCCTGTCTCCCCGTGGCTCTCGG 
GAGATACCGTTACTCCCGACGCCCACCCGCCCTACCGAGGACAGAGGGGCACCGAGAGCC 

ProSerTrpGlyProThrAspProArgArgArgSerArgAsnLeuGlyLysValIleAsp
5642 CCTAGCTGGGGCCCCACAGACCCCCGGCGTAGGTCGCGCAATTTGGGTAAGGTCATCGAT 
GGATCGACCCCGGGGTGTCTGGGGGCCGCATCCAGCGCGTTAAACCCATTCCAGTAGCTA 

^ ^

5650 APAI, 5696 CLAI, 

ThrLeuThrCysGlyPheAlaAspLeuMetGlyTyrIleProLeuValOC AM
5702 ACCCTTACGTGCGGCTTCGCCGACCTCATGGGGTACATACCGCTCGTCTAATAGTCGAC
TGGGAATGCACGCCGAAGCGGCTGGAGTACCCCATGTATGGCGAGCAGATTATCAGCTG

A A 

5724 HGIE2, 5755 SALI,

MetAlaAlaTyrAlaAlaGlnGlyTyrLysValLeuValLeuAsn
2 AGCTTACAAAACAAAATGGCTGCATATGCAGCTCAGGGCTATAAGGTGCTAGTACTCAAC
TCGAATGTTTTGTTTTACCGACGTATACGTCGAGTCCCGATATTCCACGATCATGAGTTG

1 HIND3, 24 NDEI, 52 SCAI,

ProSerValAlaAlaThrLeuGlyPheGlyAlaTyrMetSerLysAlaHisGlyIleAsp
62 CCCTCTGTTGCTGCAACACTGGGCTTTGGTGCTTACATGTCCAAGGCTCATGGGATCGAT 
GGGAGACAACGACGTTGTGACCCGAAACCACGAATGTACAGGTTCCGAGTACCCTAGCTA 

A 

116 CLAI, 

ProAsnIleArgThrGlyValArgThrIleThrThrGlySerProIleThrTyrSerThr
122 CCTAACATCAGGACCGGGGTGAGAACAATTACCACTGGCAGCCCCATCACGTACTCCACC
GGATTGTAGTCCTGGCCCCACTCTTGTTAATGGTGACCGTCGGGGTAGTGCATGAGGTGG

TyrGlyLysPheLeuAlaAspGlyGlyCysSerGlyGlyAlaTyrAspIleIleIleCys
182 TACGGCAAGTTCCTTGCCGACGGCGGGTGCTCGGGGGGCGCTTATGACATAATAATTTGT 
ATGCCGTTCAAGGAACGGCTGCCGCCCACGAGCCCCCCGCGAATACTGTATTATTAAACA 

AspGluCysHisSerThrAspAlaThrSerIleLeuGlyIleGlyThrValLeuAspGln
242 GACGAGTGCCACTCCACGGATGCCACATCCATCTTGGGCATTGGCACTGTCCTTGACCAA 
CTGCTCACGGTGAGGTGCCTACGGTGTAGGTAGAACCCGTAACCGTGACAGGAACTGGTT 

AlaGluThrAlaGlyAlaArgLeuValValLeuAlaThrAlaThrProProGlySerVal
302 GCAGAGACTGCGGGGGCGAGACTGGTTGTGCTCGCCACCGCCACCCCTCCGGGCTCCGTC 
CGTCTCTGACGCCCCCGCTCTGACCAACACGAGCGGTGGCGGTGGGGAGGCCCGAGGCAG 

A 

303 ALWNl, 

ThrValProHisProAsnIleGluGluValAlaLeuSerThrThrGlyGluIleProPhe
362 ACTGTGCCCCATCCCAACATCGAGGAGGTTGCTCTGTCCACCACCGGAGAGATCCCTTTT 
TGACACGGGGTAGGGTTGTAGCTCCTCCAACGAGACAGGTGGTGGCCTCTCTAGGGAAAA 

TyrGlyLysAlaIleProLeuGluValIleLysGlyGlyArgHisLeuIlePheCysHis
422 TACGGCAAGGCTATCCCCCTCGAAGTAATCAAGGGGGGGAGACATCTCATCTTCTGTCAT 
ATGCCGTTCCGATAGGGGGAGCTTCATTAGTTCCCCCCCTCTGTAGAGTAGAAGACAGTA

SerLysLysLysCysAspGluLeuAlaAlaLysLeuValAlaLeuGlyIleAsnAlaVal
482 TCAAAGAAGAAGTGCGACGAACTCGCCGCAAAGCTGGTCGCATTGGGCATCAATGCCGTG
AGTTTCTTCTTCACGCTGCTTGAGCGGCGTTTCGACCAGCGTAACCCGTAGTTACGGCAC

AlaTyrTyrArgGlyLeuAspValSerValIleProThrSerGlyAspValValValVal
542 GCCTACTACCGCGGTCTTGACGTGTCCGTCATCCCGACCAGCGGCGATGTTGTCGTCGTG
CGGATGATGGCGCCAGAACTGCACAGGCAGTAGGGCTGGTCGCCGCTACAACAGCAGCAC 

A A 

550 SAC2, 560 DRDl, 

AlaThrAspAlaLeuMetThrGlyTyrThrGlyAspPheAspSerValIleAspCysAsn
602 GCAACCGATGCCCTCATGACCGGCTATACCGGCGACTTCGACTCGGTGATAGACTGCAAT
CGTTGGCTACGGGAGTACTGGCCGATATGGCCGCTGAAGCTGAGCCACTATCTGACGTTA

615 BSPHl, 

ThrCysValThrGlnThrValAspPheSerLeuAspProThrPheThrIleGluThrIle
662 ACGTGTGTCACCCAGACAGTCGATTTCAGCCTTGACCCTACCTTCACCATTGAGACAATC 
TGCACACAGTGGGTCTGTCAGCTAAAGTCGGAACTGGGATGGAAGTGGTAACTCTGTTAG 

ThrLeuProGlnAspAlaValSerArgThrGlnArgArgGlyArgThrGlyArgGlyLys 
722 ACGCTCCCCCAAGATGCTGTCTCCCGCACTCAACGTCGGGGCAGGACTGGCAGGGGGAAG 
TGCGAGGGGGTTCTACGACAGAGGGCGTGAGTTGCAGCCCCGTCCTGACCGTCCCCCTTC

ProGlyIleTyrArgPheValAlaProGlyGluArgProSerGlyMetPheAspSerSer
782 CCAGGCATCTACAGATTTGTGGCACCGGGGGAGCGCCCCTCCGGCATGTTCGACTCGTCC
GGTCCGTAGATGTCTAAACACCGTGGCCCCCTCGCGGGGAGGCCGTACAAGCTGAGCAGG

A A 

816 BGLI, 833 DRDl, 

ValLeuCysGluCysTyrAspAlaGlyCysAlaTrpTyrGluLeuThrProAlaGluThr
842 GTCCTCTGTGAGTGCTATGACGCAGGCTGTGCTTGGTATGAGCTCACGCCCGCCGAGACT
CAGGAGACACTCACGATACTGCGTCCGACACGAACCATACTCGAGTGCGGGCGGCTCTGA 

A 

881 SACI, 

ThrValArgLeuArgAlaTyrMetAsnThrProGlyLeuProValCysGlnAspHisLeu
902 ACAGTTAGGCTACGAGCGTACATGAACACCCCGGGGCTTCCCGTGTGCCAGGACCATCTT 
TGTCAATCCGATGCTCGCATGTACTTGTGGGGCCCCGAAGGGCACACGGTCCTGGTAGAA 

A 

931 SMAI XMAI, 

GluPheTrpGluGlyValPheThrGlyLeuThrHisIleAspAlaHisPheLeuSerGln
962 GAATTTTGGGAGGGCGTCTTTACAGGCCTCACTCATATAGATGCCCACTTTCTATCCCAG 
CTTAAAACCCTCCCGCAGAAATGTCCGGAGTGAGTATATCTACGGGTGAAAGATAGGGTC 



985 STUI, 

ThrLysGlnSerGlyGluAsnLeuProTyrLeuValAlaTyrGlnAlaThrValCysAla
1022 ACAAAGCAGAGTGGGGAGAACCTTCCTTACCTGGTAGCGTACCAAGCCACCGTGTGCGCT 
TGTTTCGTCTCACCCCTCTTGGAAGGAATGGACCATCGCATGGTTCGGTGGCACACGCGA 

A 

1069 DRA3, 

ArgAlaGlnAlaProProProSerTrpAspGlnMetTrpLysCysLeuIleArgLeuLys 
1082 AGGGCTCAAGCCCCTCCCCCATCGTGGGACCAGATGTGGAAGTGTTTGATTCGCCTCAAG
TCCCGAGTTCGGGGAGGGGGTAGCACCCTGGTCTACACCTTCACAAACTAAGCGGAGTTC

ProThrLeuHisGlyProThrProLeuLeuTyrArgLeuGlyAlaValGlnAsnGluIle
1142 CCCACCCTCCATGGGCCAACACCCCTGCTATACAGACTGGGCGCTGTTCAGAATGAAATC
GGGTGGGAGGTACCCGGTTGTGGGGACGATATGTCTGACCCGCGACAAGTCTTACTTTAG

1150 NCOI, 

ThrLeuThrHisProValThrLysTyrIleMetThrCysMetSerAlaAspLeuGluVal
1202 ACCCTGACGCACCCAGTCACCAAATACATCATGACATGCATGTCGGCCGACCTGGAGGTC 
TGGGACTGCGTGGGTCAGTGGTTTATGTAGTACTGTACGTACAGCCGGCTGGACCTCCAG 

AAA A A 

1230 BSPHl, 1234 DRDl, 1237 AVA3, 1245 EAGl XMA3, 1250 DRDl,



ValThrSerThrTrpValLeuValGlyGlyValLeuAlaAlaLeuAlaAlaTyrCysLeu 
1262 GTCACGAGCACCTGGGTGCTCGTTGGCGGCGTCCTGGCTGCTTTGGCCGCGTATTGCCTG 
CAGTGCTCGTGGACCCACGAGCAACCGCCGCAGGACCGACGAAACCGGCGCATAACGGAC 

SerThrGlyCysValValIleValGlyArgValValLeuSerGlyLysProAlaIleIle
1322 TCAACAGGCTGCGTGGTCATAGTGGGCAGGGTCGTCTTGTCCGGGAAGCCGGCAATCATA 
AGTTGTCCGACGCACCAGTATCACCCGTCCCAGCAGAACAGGCCCTTCGGCCGTTAGTAT 

A 

1369 NAEI, 

ProAspArgGluValLeuTyrArgGluPheAspGluMetGluGluCysSerGlnHisLeu 
1382 CCTGACAGGGAAGTCCTCTACCGAGAGTTCGATGAGATGGAAGAGTGCTCTCAGCACTTA 
GGACTGTCCCTTCAGGAGATGGCTCTCAAGCTACTCTACCTTCTCACGAGAGTCGTGAAT 

A 

1385 DRDl, 

ProTyrIleGluGlnGlyMetMetLeuAlaGluGlnPheLysGlnLysAlaLeuGlyLeu
1442 CCGTACATCGAGCAAGGGATGATGCTCGCCGAGCAGTTCAAGCAGAAGGCCCTCGGCCTC
GGCATGTAGCTCGTTCCCTACTACGAGCGGCTCGTCAAGTTCGTCTTCCGGGAGCCGGAG 

LeuGlnThrAlaSerArgGlnAlaGluValIleAlaProAlaValGlnThrAsnTrpGln
1502 CTGCAGACCGCGTCCCGTCAGGCAGAGGTTATCGCCCCTGCTGTCCAGACCAACTGGCAA 
GACGTCTGGCGCAGGGCAGTCCGTCTCCAATAGCGGGGACGACAGGTCTGGTTGACCGTT 

^ ^

1502 PSTI, 1507 TTH3I, 

LysLeuGluThrPheTrpAlaLysHisMetTrpAsnPheIleSerGlyIleGlnTyrLeu
1562 AAACTCGAGACCTTCTGGGCGAAGCATATGTGGAACTTCATCAGTGGGATACAATACTTG
TTTGAGCTCTGGAAGACCCGCTTCGTATACACCTTGAAGTAGTCACCCTATGTTATGAAC 

A ^ 

1565 XHOI, 1586 NDEI, 

AlaGlyLeuSerThrLeuProGlyAsnProAlaIleAlaSerLeuMetAlaPheThrAla
1622 GCGGGCTTGTCAACGCTGCCTGGTAACCCCGCCATTGCTTCATTGATGGCTTTTACAGCT
CGCCCGAACAGTTGCGACGGACCATTGGGGCGGTAACGAAGTAACTACCGAAAATGTCGA 

A 

1643 BSTE2, 1677 ALWNl PVU2, 

AlaValThrSerProLeuThrThrSerGlnThrLeuLeuPheAsnIleLeuGlyGlyTrp
1682 GCTGTCACCAGCCCACTAACCACTAGCCAAACCCTCCTCTTCAACATATTGGGGGGGTGG
CGACAGTGGTCGGGTGATTGGTGATCGGTTTGGGAGGAGAAGTTGTATAACCCCCCCACC

ValAlaAlaGlnLeuAlaAlaProGlyAlaAlaThrAlaPheValGlyAlaGlyLeuAla
1742 GTGGCTGCCCAGCTCGCCGCCCCCGGTGCCGCTACTGCCTTTGTGGGCGCTGGCTTAGCT 
CACCGACGGGTCGAGCGGCGGGGGCCACGGCGATGACGGAAACACCCGCGACCGAATCGA 

1794 ESPl, 

GlyAlaAlaIleGlySerValGlyLeuGlyLysValLeuIleAspIleLeuAlaGlyTyr
1802 GGCGCCGCCATCGGCAGTGTTGGACTGGGGAAGGTCCTCATAGACATCCTTGCAGGGTAT 
CCGCGGCGGTAGCCGTCACAACCTGACCCCTTCCAGGAGTATCTGTAGGAACGTCCCATA 

1802 KASl NARI, 

GlyAlaGlyValAlaGlyAlaLeuValAlaPheLysIleMetSerGlyGluValProSer
1862 GGGGCGGGCGTGGCGGGAGCTCTTGTGGCATTCAAGATCATGAGCGGTGAGGTCCCCTCC
CCCCGCCCGCACCGCCCTCGAGAACACCGTAAGTTCTAGTACTCGCCACTCCAGGGGAGG

1878 SACI, 1899 BSPHl, 

ThrGluAspLeuValAsnLeuLeuProAlaIleLeuSerProGlyAlaLeuValValGly
1922 ACGGAGGACCTGGTCAATCTACTGCCCGCCATCCTCTCGCCCGGAGCCCTCGTAGTCGGC 
TGCCTCCTGGACCAGTTAGATGACGGGCGGTAGGAGAGCGGGCCTCGGGAGCATCAGCCG 

1928 TTH3I, 

ValValCysAlaAlaIleLeuArgArgHisValGlyProGlyGluGlyAlaValGlnTrp
1982 GTGGTCTGTGCAGCAATACTGCGCCGGCACGTTGGCCCGGGCGAGGGGGCAGTGCAGTGG 
CACCAGACACGTCGTTATGACGCGGCCGTGCAACCGGGCCCGCTCCCCCGTCACGTCACC 

^ A 

2004 NAEI, 2017 SMAI XMAI, 

MetAsnArgLeuIleAlaPheAlaSerArgGlyAsnHisValSerProThrHisTyrVal 
2042 ATGAACCGGCTGATAGCCTTCGCCTCCCGGGGGAACCATGTTTCCCCCACGCACTACGTG
TACTTGGCCGACTATCGGAAGCGGAGGGCCCCCTTGGTACAAAGGGGGTGCGTGATGCAC 

2067 SMAI XMAI, 2093 DRA3, 

ProGluSerAspAlaAlaAlaArgValThrAlaIleLeuSerSerLeuThrValThrGln
2102 CCGGAGAGCGATGCAGCTGCCCGCGTCACTGCCATACTCAGCAGCCTCACTGTAACCCAG
GGCCTCTCGCTACGTCGACGGGCGCAGTGACGGTATGAGTCGTCGGAGTGACATTGGGTC

2115 PVU2, 2159 ALWNl, 

LeuLeuArgArgLeuHisGlnTrpIleSerSerGluCysThrThrProCysSerGlySer 
2162 CTCCTGAGGCGACTGCACCAGTGGATAAGCTCGGAGTGTACCACTCCATGCTCCGGTTCC 
GAGGACTCCGCTGACGTGGTCACCTATTCGAGCCTCACATGGTGAGGTACGAGGCCAAGG 

2164 MST2, 2220 ECONl, 

TrpLeuArgAspIleTrpAspTrpIleCysGluValLeuSerAspPheLysThrTrpLeu 
2222 TGGCTAAGGGACATCTGGGACTGGATATGCGAGGTGTTGAGCGACTTTAAGACCTGGCTA 
ACCGATTCCCTGTAGACCCTGACCTATACGCTCCACAACTCGCTGAAATTCTGGACCGAT 

LysAlaLysLeuMetProGlnLeuProGlyIleProPheValSerCysGlnArgGlyTyr
2282 AAAGCTAAGCTCATGCCACAGCTGCCTGGGATCCCCTTTGTGTCCTGCCAGCGCGGGTAT
TTTCGATTCGAGTACGGTGTCGACGGACCCTAGGGGAAACACAGGACGGTCGCGCCCATA 

^ A A. 

2285 ESPl, 2300 PVU2, 2310 BAMHI,

LysGlyValTrpArgGlyAspGlyIleMetHisThrArgCysHisCysGlyAlaGluIle
2342 AAGGGGGTCTGGCGAGGGGACGGCATCATGCACACTCGCTGCCACTGTGGAGCTGAGATC 
TTCCCCCAGACCGCTCCCCTGCCGTAGTACGTGTGAGCGACGGTGACACCTCGACTCTAG 

ThrGlyHisValLysAsnGlyThrMetArgIleValGlyProArgThrCysArgAsnMet
2402 ACTGGACATGTCAAAAACGGGACGATGAGGATCGTCGGTCCTAGGACCTGCAGGAACATG 
TGACCTGTACAGTTTTTGCCCTGCTACTCCTAGCAGCCAGGATCCTGGACGTCCTTGTAC 

A AAA 

2425 BSABl, 2441 AVR2, 2448 SSE83871, 2449 PSTI, 

TrpSerGlyThrPheProIleAsnAlaTyrThrThrGlyProCysThrProLeuProAla 
2462 TGGAGTGGGACCTTCCCCATTAATGCCTACACCACGGGCCCCTGTACCCCCCTTCCTGCG
ACCTCACCCTGGAAGGGGTAATTACGGATGTGGTGCCCGGGGACATGGGGGGAAGGACGC 

A A 

2480 ASEl, 2497 APAI, 

ProAsnTyrThrPheAlaLeuTrpArgValSerAlaGluGluTyrValGluIleArgGln 
2522 CCGAACTACACGTTCGCGCTATGGAGGGTGTCTGCAGAGGAATACGTGGAGATAAGGCAG 
GGCTTGATGTGCAAGCGCGATACCTCCCACAGACGTCTCCTTATGCACCTCTATTCCGTC 

A 

2553 PSTI, 

ValGlyAspPheHisTyrValThrGlyMetThrThrAspAsnLeuLysCysProCysGln 
2582 GTGGGGGACTTCCACTACGTGACGGGTATGACTACTGACAATCTTAAATGCCCGTGCCAG 
CACCCCCTGAAGGTGATGCACTGCCCATACTGATGACTGTTAGAATTTACGGGCACGGTC 

A 

2594 DRA3, 

ValProSerProGluPhePheThrGluLeuAspGlyValArgLeuHisArgPheAlaPro 
2642 GTCCCATCGCCCGAATTTTTCACAGAATTGGACGGGGTGCGCCTACATAGGTTTGCGCCC
CAGGGTAGCGGGCTTAAAAAGTGTCTTAACCTGCCCCACGCGGATGTATCCAAACGCGGG 

ProCysLysProLeuLeuArgGluGluValSerPheArgValGlyLeuHisGluTyrPro 
2702 CCCTGCAAGCCCTTGCTGCGGGAGGAGGTATCATTCAGAGTAGGACTCCACGAATACCCG 
GGGACGTTCGGGAACGACGCCCTCCTCCATAGTAAGTCTCATCCTGAGGTGCTTATGGGC 

A 

2757 HGIE2,

ValGlySerGlnLeuProCysGluProGluProAspValAlaValLeuThrSerMetLeu 
2762 GTAGGGTCGCAATTACCTTGCGAGCCCGAACCGGACGTGGCCGTGTTGACGTCCATGCTC 
CATCCCAGCGTTAATGGAACGCTCGGGCTTGGCCTGCACCGGCACAACTGCAGGTACGAG 

2809 AAT2, 

ThrAspProSerHisIleThrAlaGluAlaAlaGlyArgArgLeuAlaArgGlySerPro 
2822 ACTGATCCCTCCCATATAACAGCAGAGGCGGCCGGGCGAAGGTTGGCGAGGGGATCACCC
TGACTAGGGAGGGTATATTGTCGTCTCCGCCGGCCCGCTTCCAACCGCTCCCCTAGTGGG 

A 

2850 EAGl XMA3, 

ProSerValAlaSerSerSerAlaSerGlnLeuSerAlaProSerLeuLysAlaThrCys 
2882 CCCTCTGTGGCCAGCTCCTCGGCTAGCCAGCTATCCGCTCCATCTCTCAAGGCAACTTGC
GGGAGACACCGGTCGAGGAGCCGATCGGTCGATAGGCGAGGTAGAGAGTTCCGTTGAACG 

^ A 

2889 BALI, 2903 NHEI,

ThrAlaAsnHisAspSerProAspAlaGluLeuIleGluAlaAsnLeuLeuTrpArgGln
2942 ACCGCTAACCATGACTCCCCTGATGCTGAGCTCATAGAGGCCAACCTCCTATGGAGGCAG
TGGCGATTGGTACTGAGGGGACTACGACTCGAGTATCTCCGGTTGGAGGATACCTCCGTC 

A A 

2966 ESPl, 2969 SACI,

GluMetGlyGlyAsnIleThrArgValGluSerGluAsnLysValValIleLeuAspSer
3002 GAGATGGGCGGCAACATCACCAGGGTTGAGTCAGAAAACAAAGTGGTGATTCTGGACTCC 
CTCTACCCGCCGTTGTAGTGGTCCCAACTCAGTCTTTTGTTTCACCACTAAGACCTGAGG 

PheAspProLeuValAlaGluGluAspGluArgGluIleSerValProAlaGluIleLeu
3062 TTCGATCCGCTTGTGGCGGAGGAGGACGAGCGGGAGATCTCCGTACCCGCAGAAATCCTG 
AAGCTAGGCGAACACCGCCTCCTCCTGCTCGCCCTCTAGAGGCATGGGCGTCTTTAGGAC 

3096 BGL2, 

ArgLysSerArgArgPheAlaGlnAlaLeuProValTrpAlaArgProAspTyrAsnPro 
3122 CGGAAGTCTCGGAGATTCGCCCAGGCCCTGCCCGTTTGGGCGCGGCCGGACTATAACCCC
GCCTTCAGAGCCTCTAAGCGGGTCCGGGACGGGCAAACCCGCGCCGGCCTGATATTGGGG 

A ^ 

3143 ALWNl, 3164 EAGl XMA3,

ProLeuValGluThrTrpLysLysProAspTyrGluProProValValHisGlyCysPro 
3182 CCGCTAGTGGAGACGTGGAAAAAGCCCGACTACGAACCACCTGTGGTCCATGGCTGCCCG 
GGCGATCACCTCTGCACCTTTTTCGGGCTGATGCTTGGTGGACACCAGGTACCGACGGGC 

A A 

3217 HGIE2, 3229 NCOI, 

LeuProProProLysSerProProValProProProArgLysLysArgThrValValLeu 
3242 CTTCCACCTCCAAAGTCCCCTCCTGTGCCTCCGCCTCGGAAGAAGCGGACGGTGGTCCTC 
GAAGGTGGAGGTTTCAGGGGAGGACACGGAGGCGGAGCCTTCTTCGCCTGCCACCAGGAG 

ThrGluSerThrLeuSerThrAlaLeuAlaGluLeuAlaThrArgSerPheGlySerSer
3302 ACTGAATCAACCCTATCTACTGCCTTGGCCGAGCTCGCCACCAGAAGCTTTGGCAGCTCC
TGACTTAGTTGGGATAGATGACGGAACCGGCTCGAGCGGTGGTCTTCGAAACCGTCGAGG 

A A 

3332 SACI, 3346 HIND3, 

SerThrSerGlyIleThrGlyAspAsnThrThrThrSerSerGluProAlaProSerGly
3362 TCAACTTCCGGCATTACGGGCGACAATACGACAACATCCTCTGAGCCCGCCCCTTCTGGC 
AGTTGAAGGCCGTAATGCCCGCTGTTATGCTGTTGTAGGAGACTCGGGCGGGGAAGACCG 

CysProProAspSerAspAlaGluSerTyrSerSerMetProProLeuGluGlyGluPro
3422 TGCCCCCCCGACTCCGACGCTGAGTCCTATTCCTCCATGCCCCCCCTGGAGGGGGAGCCT 
ACGGGGGGGCTGAGGCTGCGACTCAGGATAAGGAGGTACGGGGGGGACCTCCCCCTCGGA 

A 

3437 EAM11051, 

GlyAspProAspLeuSerAspGlySerTrpSerThrValSerSerGluAlaAsnAlaGlu
3482 GGGGATCCGGATCTTAGCGACGGGTCATGGTCAACGGTCAGTAGTGAGGCCAACGCGGAG
CCCCTAGGCCTAGAATCGCTGCCCAGTACCAGTTGCCAGTCATCACTCCGGTTGCGCCTC

3484 BAMHI, 3485 BSABl, 3487 BSPEl,

AspValValCysCysSerMetSerTyrSerTrpThrGlyAlaLeuValThrProCysAla 
3542 GATGTCGTGTGCTGCTCAATGTCTTACTCTTGGACAGGCGCACTCGTCACCCCGTGCGCC 
CTACAGCACACGACGAGTTACAGAATGAGAACCTGTCCGCGTGAGCAGTGGGGCACGCGG

3589 DRA3, 3600 SAC2,

AlaGluGluGlnLysLeuProIleAsnAlaLeuSerAsnSerLeuLeuArgHisHisAsn
3602 GCGGAAGAACAGAAACTGCCCATCAATGCACTAAGCAACTCGTTGCTACGTCACCACAAT
CGCCTTCTTGTCTTTGACGGGTAGTTACGTGATTCGTTGAGCAACGATGCAGTGGTGTTA 

3611 ALWNl, 3655 PFLMl, 

LeuValTyrSerThrThrSerArgSerAlaCysGlnArgGlnLysLysValThrPheAsp 
3662 TTGGTGTATTCCACCACCTCACGCAGTGCTTGCCAAAGGCAGAAGAAAGTCACATTTGAC 
AACCACATAAGGTGGTGGAGTGCGTCACGAACGGTTTCCGTCTTCTTTCAGTGTAAACTG 

3681 DRA3, 

ArgLeuGlnValLeuAspSerHisTyrGlnAspValLeuLysGluValLysAlaAlaAla 
3722 AGACTGCAAGTTCTGGACAGCCATTACCAGGACGTACTCAAGGAGGTTAAAGCAGCGGCG 
TCTGACGTTCAAGACCTGTCGGTAATGGTCCTGCATGAGTTCCTCCAATTTCGTCGCCGC 

SerLysValLysAlaAsnLeuLeuSerValGluGluAlaCysSerLeuThrProProHis 
3782 TCAAAAGTGAAGGCTAACTTGCTATCCGTAGAGGAAGCTTGCAGCCTGACGCCCCCACAC 
AGTTTTCACTTCCGATTGAACGATAGGCATCTCCTTCGAACGTCGGACTGCGGGGGTGTG 

A 

3816 HIND3, 

SerAlaLysSerLysPheGlyTyrGlyAlaLysAspValArgCysHisAlaArgLysAla 
3842 TCAGCCAAATCCAAGTTTGGTTATGGGGCAAAAGACGTCCGTTGCCATGCCAGAAAGGCC 
AGTCGGTTTAGGTTCAAACCAATACCCCGTTTTCTGCAGGCAACGGTACGGTCTTTCCGG

A A 

3875 AAT2, 3890 BGLI, 

ValThrHisIleAsnSerValTrpLysAspLeuLeuGluAspAsnValThrProIleAsp 
3902 GTAACCCACATCAACTCCGTGTGGAAAGACCTTCTGGAAGACAATGTAACACCAATAGAC 
CATTGGGTGTAGTTGAGGCACACCTTTCTGGAAGACCTTCTGTTACATTGTGGTTATCTG 

ThrThrIleMetAlaLysAsnGluValPheCysValGlnProGluLysGlyGlyArgLys
3962 ACTACCATCATGGCTAAGAACGAGGTTTTCTGCGTTCAGCCTGAGAAGGGGGGTCGTAAG 
TGATGGTAGTACCGATTCTTGCTCCAAAAGACGCAAGTCGGACTCTTCCCCCCAGCATTC 

ProAlaArgLeuIleValPheProAspLeuGlyValArgValCysGluLysMetAlaLeu 
4022 CCAGCTCGTCTCATCGTGTTCCCCGATCTGGGCGTGCGCGTGTGCGAAAAGATGGCTTTG 
GGTCGAGCAGAGTAGCACAAGGGGCTAGACCCGCACGCGCACACGCTTTTCTACCGAAAC 

TyrAspValValThrLysLeuProLeuAlaValMetGlySerSerTyrGlyPheGlnTyr
4082 TACGACGTGGTTACAAAGCTCCCCTTGGCCGTGATGGGAAGCTCCTACGGATTCCAATAC
ATGCTGCACCAATGTTTCGAGGGGAACCGGCACTACCCTTCGAGGATGCCTAAGGTTATG 

SerProGlyGlnArgValGluPheLeuValGlnAlaTrpLysSerLysLysThrProMet
4142 TCACCAGGACAGCGGGTTGAATTCCTCGTGCAAGCGTGGAAGTCCAAGAAAACCCCAATG 
AGTGGTCCTGTCGCCCAACTTAAGGAGCACGTTCGCACCTTCAGGTTCTTTTGGGGTTAC 

A 

4160 ECORI, 

GlyPheSerTyrAspThrArgCysPheAspSerThrValThrGluSerAspIleArgThr 
4202 GGGTTCTCGTATGATACCCGCTGCTTTGACTCCACAGTCACTGAGAGCGACATCCGTACG 
CCCAAGAGCATACTATGGGCGACGAAACTGAGGTGTCAGTGACTCTCGCTGTAGGCATGC

4229 DRDl, 4236 ALWNl,

GluGluAlaIleTyrGlnCysCysAspLeuAspProGlnAlaArgValAlaIleLysSer
4262 GAGGAGGCAATCTACCAATGTTGTGACCTCGACCCCCAAGCCCGCGTGGCCATCAAGTCC
CTCCTCCGTTAGATGGTTACAACACTGGAGCTGGGGGTTCGGGCGCACCGGTAGTTCAGG

A A 

4301 BGLI, 4308 BALI,

LeuThrGluArgLeuTyrValGlyGlyProLeuThrAsnSerArgGlyGluAsnCysGly
4322 CTCACCGAGAGGCTTTATGTTGGGGGCCCTCTTACCAATTCAAGGGGGGAGAACTGCGGC
GAGTGGCTCTCCGAAATACAACCCCCGGGAGAATGGTTAAGTTCCCCCCTCTTGACGCCG 

A 

4345 APAI,

TyrArgArgCysArgAlaSerGlyValLeuThrThrSerCysGlyAsnThrLeuThrCys
4382 TATCGCAGGTGCCGCGCGAGCGGCGTACTGACAACTAGCTGTGGTAACACCCTCACTTGC
ATAGCGTCCACGGCGCGCTCGCCGCATGACTGTTGATCGACACCATTGTGGGAGTGAACG 

TyrIleLysAlaArgAlaAlaCysArgAlaAlaGlyLeuGlnAspCysThrMetLeuVal
4442 TACATCAAGGCCCGGGCAGCCTGTCGAGCCGCAGGGCTCCAGGACTGCACCATGCTCGTG
ATGTAGTTCCGGGCCCGTCGGACAGCTCGGCGTCCCGAGGTCCTGACGTGGTACGAGCAC 

A

4452 SMAI XMAI, 

CysGlyAspAspLeuValValIleCysGluSerAlaGlyValGlnGluAspAlaAlaSer
4502 TGTGGCGACGACTTAGTCGTTATCTGTGAAAGCGCGGGGGTCCAGGAGGACGCGGCGAGC 
ACACCGCTGCTGAATCAGCAATAGACACTTTCGCGCCCCCAGGTCCTCCTGCGCCGCTCG 

4508 DRDl, 4511 TTH3I, 

LeuArgAlaPheThrGluAlaMetThrArgTyrSerAlaProProGlyAspProProGln 
4562 CTGAGAGCCTTCACGGAGGCTATGACCAGGTACTCCGCCCCCCCTGGGGACCCCCCACAA 
GACTCTCGGAAGTGCCTCCGATACTGGTCCATGAGGCGGGGGGGACCCCTGGGGGGTGTT 

ProGluTyrAspLeuGluLeuIleThrSerCysSerSerAsnValSerValAlaHisAsp
4622 CCAGAATACGACTTGGAGCTCATAACATCATGCTCCTCCAACGTGTCAGTCGCCCACGAC
GGTCTTATGCTGAACCTCGAGTATTGTAGTACGAGGAGGTTGCACAGTCAGCGGGTGCTG 

A 

4637 SACI,

GlyAlaGlyLysArgValTyrTyrLeuThrArgAspProThrThrProLeuAlaArgAla
4682 GGCGCTGGAAAGAGGGTCTACTACCTCACCCGTGACCCTACAACCCCCCTCGCGAGAGCT
CCGCGACCTTTCTCCCAGATGATGGAGTGGGCACTGGGATGTTGGGGGGAGCGCTCTCGA 

A 

4731 NRUI, 

AlaTrpGluThrAlaArgHisThrProValAsnSerTrpLeuGlyAsnIleIleMetPhe
4742 GCGTGGGAGACAGCAAGACACACTCCAGTCAATTCCTGGCTAGGCAACATAATCATGTTT 
CGCACCCTCTGTCGTTCTGTGTGAGGTCAGTTAAGGACCGATCCGTTGTATTAGTACAAA 

AlaProThrLeuTrpAlaArgMetIleLeuMetThrHisPhePheSerValLeuIleAla
4802 GCCCCCACACTGTGGGCGAGGATGATACTGATGACCCATTTCTTTAGCGTCCTTATAGCC 
CGGGGGTGTGACACCCGCTCCTACTATGACTACTGGGTAAAGAAATCGCAGGAATATCGG 

A A 

4806 PFLMl, 4807 DRA3, 

ArgAspGlnLeuGluGlnAlaLeuAspCysGluIleTyrGlyAlaCysTyrSerIleGlu
4862 AGGGACCAGCTTGAACAGGCCCTCGATTGCGAGATCTACGGGGCCTGCTACTCCATAGAA
TCCCTGGTCGAACTTGTCCGGGAGCTAACGCTCTAGATGCCCCGGACGATGAGGTATCTT 

4893 BGL2, 

ProLeuAspLeuProProIleIleGlnArgLeuHisGlyLeuSerAlaPheSerLeuHis
4922 CCACTGGATCTACCTCCAATCATTCAAAGACTCCATGGCCTCAGCGCATTTTCACTCCAC
GGTGACCTAGATGGAGGTTAGTAAGTTTCTGAGGTACCGGAGTCGCGTAAAAGTGAGGTG

4954 NCOI,

SerTyrSerProGlyGluIleAsnArgValAlaAlaCysLeuArgLysLeuGlyValPro 
4982 AGTTACTCTCCAGGTGAAATCAATAGGGTGGCCGCATGCCTCAGAAAACTTGGGGTACCG
TCAATGAGAGGTCCACTTTAGTTATCCCACCGGCGTACGGAGTCTTTTGAACCCCATGGC 

A A 

5015 SPHI, 5035 KPNI, 

ProLeuArgAlaTrpArgHisArgAlaArgSerValArgAlaArgLeuLeuAlaArgGly 
5042 CCCTTGCGAGCTTGGAGACACCGGGCCCGGAGCGTCCGCGCTAGGCTTCTGGCCAGAGGA
GGGAACGCTCGAACCTCTGTGGCCCGGGCCTCGCAGGCGCGATCCGAAGACCGGTCTCCT 

A A 

5064 APAI, 5091 BALI, 

GlyArgAlaAlaIleCysGlyLysTyrLeuPheAsnTrpAlaValArgThrLysLeuLys
5102 GGCAGGGCTGCCATATGTGGCAAGTACCTCTTCAACTGGGCAGTAAGAACAAAGCTCAAA 
CCGTCCCGACGGTATACACCGTTCATGGAGAAGTTGACCCGTCATTCTTGTTTCGAGTTT 

5113 NDEI,

LeuThrProIleAlaAlaAlaGlyGlnLeuAspLeuSerGlyTrpPheThrAlaGlyTyr 
5162 CTCACTCCAATAGCGGCCGCTGGCCAGCTGGACTTGTCCGGCTGGTTCACGGCTGGCTAC 
GAGTGAGGTTATCGCCGGCGACCGGTCGACCTGAACAGGCCGACCAAGTGCCGACCGATG 

A A A A 

5174 NOTI, 5175 EAGl XMA3, 5182 BALI, 5186 PVU2, 

SerGlyGlyAspIleTyrHisSerValSerHisAlaArgProArgTrpIleTrpPheCys 
5222 AGCGGGGGAGACATTTATCACAGCGTGTCTCATGCCCGGCCCCGCTGGATCTGGTTTTGC 
TCGCCCCCTCTGTAAATAGTGTCGCACAGAGTACGGGCCGGGGCGACCTAGACCAAAACG 

A 

5240 DRA3, 

LeuLeuLeuLeuAlaAlaGlyValGlyIleTyrLeuLeuProAsnArgMetSerThrAsn
5282 CTACTCCTGCTTGCTGCAGGGGTAGGCATCTACCTCCTCCCCAACCGAATGAGCACGAAT
GATGAGGACGAACGACGTCCCCATCCGTAGATGGAGGAGGGGTTGGCTTACTCGTGCTTA

A 

5295 PSTI, 

ProLysProGlnArgLysThrLysArgAsnThrAsnArgArgProGlnAspValLysPhe 
5342 CCTAAACCTCAAAGAAAGACCAAACGTAACACCAACCGGCGGCCGCAGGACGTCAAGTTC 
GGATTTGGAGTTTCTTTCTGGTTTGCATTGTGGTTGGCCGCCGGCGTCCTGCAGTTCAAG 

A A A A 

5380 NOTI, 5381 EAGl XMA3, 5390 AAT2, 5401 SMAI XMAI, 

ProGlyGlyGlyGlnIleValGlyGlyValTyrLeuLeuProArgArgGlyProArgLeu
5402 CCGGGTGGCGGTCAGATCGTTGGTGGAGTTTACTTGTTGCCGCGCAGGGGCCCTAGATTG 
GGCCCACCGCCAGTCTAGCAACCACCTCAAATGAACAACGGCGCGTCCCCGGGATCTAAC

5449 APAI,

GlyValArgAlaThrArgLysThrSerGluArgSerGlnProArgGlyArgArgGlnPro 
5462 GGTGTGCGCGCGACGAGAAAGACTTCCGAGCGGTCGCAACCTCGAGGTAGACGTCAGCCT 
CCACACGCGCGCTGCTCTTTCTGAAGGCTCGCCAGCGTTGGAGCTCCATCTGCAGTCGGA

5467 BSSH2, 5478 XMNI, 5502 XHOI, 5511 AAT2,

IleProLysAlaArgArgProGluGlyArgThrTrpAlaGlnProGlyTyrProTrpPro
5522 ATCCCCAAGGCTCGTCGGCCCGAGGGCAGGACCTGGGCTCAGCCCGGGTACCCTTGGCCC 
TAGGGGTTCCGAGCAGCCGGGCTCCCGTCCTGGACCCGAGTCGGGCCCATGGGAACCGGG 

5548 ALWNl, 5558 ESPl, 5564 SMAI XMAI, 5568 KPNI, 

LeuTyrGlyAsnGluGlyCysGlyTrpAlaGlyTrpLeuLeuSerProArgGlySerArg 
5582 CTCTATGGCAATGAGGGCTGCGGGTGGGCGGGATGGCTCCTGTCTCCCCGTGGCTCTCGG 
GAGATACCGTTACTCCCGACGCCCACCCGCCCTACCGAGGACAGAGGGGCACCGAGAGCC 

ProSerTrpGlyProThrAspProArgArgArgSerArgAsnLeuGlyLysValIleAsp
5642 CCTAGCTGGGGCCCCACAGACCCCCGGCGTAGGTCGCGCAATTTGGGTAAGGTCATCGAT 
GGATCGACCCCGGGGTGTCTGGGGGCCGCATCCAGCGCGTTAAACCCATTCCAGTAGCTA 

5650 APAI, 5696 CLAI,

ThrLeuThrCysGlyPheAlaAspLeuMetGlyTyrIleProLeuValGlyAlaProLeu
5702 ACCCTTACGTGCGGCTTCGCCGACCTCATGGGGTACATACCGCTCGTCGGCGCCCCTCTT 
TGGGAATGCACGCCGAAGCGGCTGGAGTACCCCATGTATGGCGAGCAGCCGCGGGGAGAA 

5724 HGIE2, 5750 KASl NARI, 5756 ECONl, 

GlyGlyAlaAlaArgAlaOC AM 
5762 GGAGGCGCTGCCAGGGCCTAATAGTCGAC
CCTCCGCGACGGTCCCGGATTATCAGCTG 

5785 SALI,

<110> CHIRON CORPORATION et al.

<120> NOVEL HCV NON-STRUCTURAL POLYPEPTIDE

<130> PP01617.003

<140>
<141>

<160> 19

<170> PatentIn Ver. 2.0

<210> 1
<211> 9620
<212> DNA
<213> Artificial Sequence

<220>
<221> CDS
<222> (1990)..(7302)

<220>
<223> Description of Artificial Sequence: Hepatitis C pns345

<400> 1

cgcgcgtttc ggtgatgacg gtgaaaacct ctgacacatg cagctcccgg agacggtcac 60
agcttgtctg taagcggatg ccgggagcag acaagcccgt cagggcgcgt cagcgggtgt 120
tggcgggtgt cggggctggc ttaactatgc ggcatcagag cagattgtac tgagagtgca 180
ccatatgaag ctttttgcaa aagcctaggc ctccaaaaaa gcctcctcac tacttctgga 240
atagctcaga ggccgaggcg gcctcggcct ctgcataaat aaaaaaaatt agtcagccat 300
ggggcggaga atgggcggaa ctgggcgggg agggaattat tggctattgg ccattgcata 360
cgttgtatct atatcataat atgtacattt atattggctc atgtccaata tgaccgccat 420
gttgacattg attattgact agttattaat agtaatcaat tacggggtca ttagttcata 480
gcccatatat ggagttccgc gttacataac ttacggtaaa tggcccgcct ggctgaccgc 540
ccaacgaccc ccgcccattg acgtcaataa tgacgtatgt tcccatagta acgccaatag 600
ggactttcca ttgacgtcaa tgggtggagt atttacggta aactgcccac ttggcagtac 660
atcaagtgta tcatatgcca agtccgcccc ctattgacgt caatgacggt aaatggcccg 720
cctggcatta tgcccagtac atgaccttac gggactttcc tacttggcag tacatctacg 780
tattagtcat cgctattacc atggtgatgc ggttttggca gtacaccaat gggcgtggat 840
agcggtttga ctcacgggga tttccaagtc tccaccccat tgacgtcaat gggagtttgt 900
tttggcacca aaatcaacgg gactttccaa aatgtcgtaa taaccccgcc ccgttgacgc 960
aaatgggcgg taggcgtgta cggtgggagg tctatataag cagagctcgt ttagtgaacc 1020
gtcagatcgc ctggagacgc catccacgct gttttgacct ccatagaaga caccgggacc 1080
gatccagcct ccgcggccgg gaacggtgca ttggaacgcg gattccccgt gccaagagtg 1140
acgtaagtac cgcctataga ctctataggc acaccccttt ggctcttatg catgctatac 1200
tgtttttggc ttggggccta tacacccccg ctccttatgc tataggtgat ggtatagctt 1260
agcctatagg tgtgggttat tgaccattat tgaccactcc cctattggtg acgatacttt 1320
ccattactaa tccataacat ggctctttgc cacaactatc tctattggct atatgccaat 1380
actctgtcct tcagagactg acacggactc tgtattttta caggatgggg tccatttatt 1440
atttacaaat tcacatatac aacaacgccg tcccccgtgc ccgcagtttt tattaaacat 1500
agcgtgggat ctccgacatc tcgggtacgt gttccggaca tgggctcttc tccggtagcg 1560
gcggagcttc cacatccgag ccctggtccc atccgtccag cggctcatgg tcgctcggca 1620
gctccttgct cctaacagtg gaggccagac ttaggcacag cacaatgccc accaccacca 1680
gtgtgccgca caaggccgtg gcggtagggt atgtgtctga aaatgagctc ggagattggg 1740
ctcgcacctg gacgcagatg gaagacttaa ggcagcggca gaagaagatg caggcagctg 1800
agttgttgta ttctgataag agtcagaggt aactcccgtt gcggtgctgt taacggtgga 1860
gggcagtgta gtctgagcag tactcgttgc tgccgcgcgc gccaccagac ataatagctg 1920
acagactaac agactgttcc tttccatggg tcttttctgc agtcaccgtc gtcgacctaa 1980


gaattcacc atg gct gca tat gca gct cag ggc tat aag gtg cta gta ctc 2031
Met Ala Ala Tyr Ala Ala Gln Gly Tyr Lys Val Leu Val Leu
1 5 10

aac ccc tct gtt gct gca aca ctg ggc ttt ggt gct tac atg tcc aag 2079
Asn Pro Ser Val Ala Ala Thr Leu Gly Phe Gly Ala Tyr Met Ser Lys
15 20 25 30

gct cat ggg atc gat cct aac atc agg acc ggg gtg aga aca att acc 2127
Ala His Gly Ile Asp Pro Asn Ile Arg Thr Gly Val Arg Thr Ile Thr
35 40 45

act ggc agc ccc atc acg tac tcc acc tac ggc aag ttc ctt gcc gac 2175
Thr Gly Ser Pro Ile Thr Tyr Ser Thr Tyr Gly Lys Phe Leu Ala Asp
50 55 60

ggc ggg tgc tcg ggg ggc gct tat gac ata ata att tgt gac gag tgc 2223
Gly Gly Cys Ser Gly Gly Ala Tyr Asp Ile Ile Ile Cys Asp Glu Cys
65 70 75

cac tcc acg gat gcc aca tcc atc ttg ggc att ggc act gtc ctt gac 2271
His Ser Thr Asp Ala Thr Ser Ile Leu Gly Ile Gly Thr Val Leu Asp
80 85 90
caa gca gag act gcg ggg gcg aga ctg gtt gtg ctc gcc acc gcc acc 2319
Gin Ala Glu Thr Ala Gly Ala Arg Leu Val Val Leu Ala Thr Ala Thr 
95 100 105 110 

cct ccg ggc tec gtc act gtg ccc cat ccc aac ate gag gag gtt get 2367 
Pro Pro Gly Ser Val Thr Val Pro His Pro Asn He Glu Glu Val Ala 
115 120 125 

ctg tec acc acc gga gag ate cct ttt tac ggc aag get ate ccc etc 2415 
Leu Ser Thr Thr Gly Glu He Pro Phe Tyr Gly Lys Ala He Pro Leu 
130 135 140 

gaa gta ate aag ggg ggg aga cat etc ate ttc tgt cat tea aag aag 2463 
Glu Val He Lys Gly Gly Arg His Leu He Phe Cys His Ser Lys Lys 
145 150 155 

aag tgc gac gaa etc gcc gca aag ctg gtc gca ttg ggc ate aat gee 2511 
Lys Cys Asp Glu Leu Ala Ala Lys Leu Val Ala Leu Gly He Asn Ala 
160 165 170 

gtg gee tac tac cge ggt ett gac gtg tee gtc ate ccg ace age ggc 2559 
Val Ala Tyr Tyr Arg Gly Leu Asp Val Ser Val He Pro Thr Ser Gly 
175 180 185 190 

gat gtt gtc gtc gtg gca acc gat gcc etc atg acc ggc tat acc ggc 2607 
Asp Val Val Val Val Ala Thr Asp Ala Leu Met Thr Gly Tyr Thr Gly
195 200 205 

gac ttc gac teg gtg ata gac tgc aat aeg tgt gtc acc cag aca gtc 2655 
Asp Phe Asp Ser Val He Asp Cys Asn Thr Cys Val Thr Gin Thr Val 

210 215 220 

gat ttc age ett gac cct acc ttc acc att gag aca ate acg etc ccc 2703 
Asp Phe Ser Leu Asp Pro Thr Phe Thr He Glu Thr He Thr Leu Pro 
225 230 235 

caa gat get gtc tec cgc act caa cgt egg ggc agg act ggc agg ggg 2751 
Gin Asp Ala Val Ser Arg Thr Gin Arg Arg Gly Arg Thr Gly Arg Gly 
240 245 250 

aag cea ggc ate tac aga ttt gtg gca ccg ggg gag cgc ccc tec ggc 2799 
Lys Pro Gly He Tyr Arg Phe Val Ala Pro Gly Glu Arg Pro Ser Gly 
255 260 265 270 

atg ttc gac teg tec gtc etc tgt gag tgc tat gac gca ggc tgt get 2847 
Met Phe Asp Ser Ser Val Leu Cys Glu Cys Tyr Asp Ala Gly Cys Ala 
275 280 285 

tgg tat gag etc acg ccc gcc gag act aca gtt agg eta cga gcg tac 2895 
Trp Tyr Glu Leu Thr Pro Ala Glu Thr Thr Val Arg Leu Arg Ala Tyr 
290 295 300 

atg aac acc ccg ggg ett ccc gtg tgc cag gac cat ett gaa ttt tgg 2943 
Met Asn Thr Pro Gly Leu Pro Val Cys Gin Asp His Leu Glu Phe Trp 
305 310 315 

gag ggc gtc ttt aca ggc etc act eat ata gat gcc cae ttt eta tee 2991 
Glu Gly Val Phe Thr Gly Leu Thr His He Asp Ala His Phe Leu Ser 
320 325 330 
cag aca aag cag agt ggg gag aac ctt cct tac ctg gta gcg tac caa 3039
Gin Thr Lys Gin Ser Gly Glu Asn Leu Pro Tyr Leu Val Ala Tyr Gin 
335 340 345 350 

gcc acc gtg tgc get agg get caa gcc cct ccc cca teg tgg gac cag 3087 
Ala Thr Val Cys Ala Arg Ala Gin Ala Pro Pro Pro Ser Trp Asp Gin 
355 360 365 

atg tgg aag tgt ttg att cgc etc aag ccc acc etc cat ggg cca aca 3135 
Met Trp Lys Cys Leu lie Arg Leu Lys Pro Thr Leu His Gly Pro Thr 
370 375 380 

ccc ctg eta tac aga ctg ggc get gtt cag aat gaa ate ace ctg aeg 3183 
Pro Leu Leu Tyr Arg Leu Gly Ala Val Gin Asn Glu lie Thr Leu Thr 
385 390 395 

cac cca gtc acc aaa tac ate atg aca tgc atg teg gcc gac ctg gag 3231 
His Pro Val Thr Lys Tyr He Met Thr Cys Met Ser Ala Asp Leu Glu 
400 405 410 

gtc gtc aeg age acc tgg gtg etc gtt ggc ggc gtc ctg get get ttg 3279 
Val Val Thr Ser Thr Trp Val Leu Val Gly Gly Val Leu Ala Ala Leu 
415 420 425 430 

gcc gcg tat tgc ctg tea aca ggc tgc gtg gtc ata gtg ggc agg gtc 3327 
Ala Ala Tyr Cys Leu Ser Thr Gly Cys Val Val He Val Gly Arg Val 
435 440 445 

gtc ttg tec ggg aag ccg gea ate ata cct gac agg gaa gtc etc tac 3375 
Val Leu Ser Gly Lys Pro Ala He He Pro Asp Arg Glu Val Leu Tyr 
450 455 460 

cga gag ttc gat gag atg gaa gag tgc tet cag cac tta ccg tac ate 3423 
Arg Glu Phe Asp Glu Met Glu Glu Cys Ser Gin His Leu Pro Tyr He 
465 470 475 

gag caa ggg atg atg etc gcc gag cag ttc aag cag aag gcc etc ggc 3471 
Glu Gin Gly Met Met Leu Ala Glu Gin Phe Lys Gin Lys Ala Leu Gly 
480 485 490 

etc ctg cag acc gcg tec cgt cag gea gag gtt ate gee cct get gtc 3519 
Leu Leu Gin Thr Ala Ser Arg Gin Ala Glu Val He Ala Pro Ala Val 
495 500 505 510 

cag acc aac tgg caa aaa etc gag acc ttc tgg gcg aag cat atg tgg 3567 
Gin Thr Asn Trp Gin Lys Leu Glu Thr Phe Trp Ala Lys His Met Trp 
515 520 525 

aac ttc ate agt ggg ata caa tac ttg gcg ggc ttg tea aeg ctg cct 3615 
Asn Phe He Ser Gly He Gin Tyr Leu Ala Gly Leu Ser Thr Leu Pro 
530 535 540 

ggt aac ccc gcc att get tea ttg atg get ttt aca get get gtc acc 3663 
Gly Asn Pro Ala He Ala Ser Leu Met Ala Phe Thr Ala Ala Val Thr 
545 550 555 

agc cca cta acc act agc caa acc ctc ctc ttc aac ata ttg ggg ggg 3711
Ser Pro Leu Thr Thr Ser Gin Thr Leu Leu Phe Asn He Leu Gly Gly 
560 565 570 
tgg gtg gct gcc cag ctc gcc gcc ccc ggt gcc gct act gcc ttt gtg 3759
Trp Val Ala Ala Gin Leu Ala Ala Pro Gly Ala Ala Thr Ala Phe Val 
575 580 585 590 

ggc get ggc tta get ggc gcc gcc ate ggc agt gtt gga ctg ggg aag 3807 
Gly Ala Gly Leu Ala Gly Ala Ala lie Gly Ser Val Gly Leu Gly Lys 
595 600 605 

gtc etc ata gae ate ctt gea ggg tat ggc gcg ggc gtg gcg gga get 3855 
Val Leu lie Asp lie Leu Ala Gly Tyr Gly Ala Gly Val Ala Gly Ala 
610 615 620 

ctt gtg gea ttc aag ate atg age ggt gag gtc cce tec aeg gag gae 3903 
Leu Val Ala Phe Lys lie Met Ser Gly Glu Val Pro Ser Thr Glu Asp 
625 630 635 

ctg gtc aat eta ctg cce gcc ate etc teg cce gga gcc etc gta gtc 3951 
Leu Val Asn Leu Leu Pro Ala lie Leu Ser Pro Gly Ala Leu Val Val 

640 645 650 

ggc gtg gtc tgt gea gea ata ctg cgc egg eac gtt ggc ceg ggc gag 3999 
Gly Val Val Cys Ala Ala lie Leu Arg Arg His Val Gly Pro Gly Glu 
655 660 665 670 

ggg gea gtg cag tgg atg aac egg ctg ata gee ttc gcc tec egg ggg 4047 
Gly Ala Val Gin Trp Met Asn Arg Leu lie Ala Phe Ala Ser Arg Gly 
675 680 685 

aac cat gtt tec cce aeg eac tac gtg ceg gag age gat gea get gcc 4095 
Asn His Val Ser Pro Thr His Tyr Val Pro Glu Ser Asp Ala Ala Ala 
690 695 700 

cgc gtc act gee ata etc age age etc act gta ace cag etc ctg agg 4143 
Arg Val Thr Ala lie Leu Ser Ser Leu Thr Val Thr Gin Leu Leu Arg 
705 710 715 

cga ctg eac cag tgg ata age teg gag tgt acc act cca tgc tec ggt 4191 
Arg Leu His Gln Trp Ile Ser Ser Glu Cys Thr Thr Pro Cys Ser Gly
720 725 730 

tee tgg eta agg gae ate tgg gae tgg ata tgc gag gtg ttg age gae 4239 
Ser Trp Leu Arg Asp lie Trp Asp Trp lie Cys Glu Val Leu Ser Asp 
735 740 745 750 

ttt aag acc tgg eta aaa get aag etc atg cca cag ctg cet ggg ate 4287 
Phe Lys Thr Trp Leu Lys Ala Lys Leu Met Pro Gin Leu Pro Gly lie 
755 760 765 

cce ttt gtg tec tgc cag cgc ggg tat aag ggg gtc tgg cga ggg gae 4335 
Pro Phe Val Ser Cys Gin Arg Gly Tyr Lys Gly Val Trp Arg Gly Asp 
770 775 780 

ggc ate atg cac act cgc tgc eac tgt gga get gag ate act gga eat 4383 
Gly lie Met His Thr Arg Cys His Cys Gly Ala Glu lie Thr Gly His 
785 790 795 

gtc aaa aac ggg aeg atg agg ate gtc ggt ect agg acc tgc agg aac 4431 
Val Lys Asn Gly Thr Met Arg lie Val Gly Pro Arg Thr Cys Arg Asn 
800 805 810 
atg tgg agt ggg acc ttc ccc att aat gcc tac acc acg ggc ccc tgt 4479
Met Trp Ser Gly Thr Phe Pro lie Asn Ala Tyr Thr Thr Gly Pro Cys 
815 820 825 830 

acc ccc ctt cct gcg ccg aac tac acg ttc gcg eta tgg agg gtg tct 4527 
Thr Pro Leu Pro Ala Pro Asn Tyr Thr Phe Ala Leu Trp Arg Val Ser 
835 840 845 

gca gag gaa tac gtg gag ata agg cag gtg ggg gac ttc cac tac gtg 4575 
Ala Glu Glu Tyr Val Glu He Arg Gin Val Gly Asp Phe His Tyr Val 
850 855 860 

acg ggt atg act act gac aat ctt aaa tgc ccg tgc cag gtc cca teg 4623 
Thr Gly Met Thr Thr Asp Asn Leu Lys Cys Pro Cys Gin Val Pro Ser 
865 870 875 

ccc gaa ttt ttc aca gaa ttg gac ggg gtg cgc eta cat agg ttt gcg 4671 
Pro Glu Phe Phe Thr Glu Leu Asp Gly Val Arg Leu His Arg Phe Ala 

880 885 890 

ccc ccc tgc aag ccc ttg ctg cgg gag gag gta tca ttc aga gta gga 4719
Pro Pro Cys Lys Pro Leu Leu Arg Glu Glu Val Ser Phe Arg Val Gly 
895 900 905 910 

etc cac gaa tac ccg gta ggg teg caa tta cct tgc gag ccc gaa ccg 4767 
Leu His Glu Tyr Pro Val Gly Ser Gin Leu Pro Cys Glu Pro Glu Pro 
915 920 925 

gac gtg gcc gtg ttg acg tec atg etc act gat ccc tec eat ata aca 4815 
Asp Val Ala Val Leu Thr Ser Met Leu Thr Asp Pro Ser His He Thr 
930 935 940 

gca gag gcg gcc ggg cga agg ttg gcg agg gga tca ccc ccc tct gtg 4863
Ala Glu Ala Ala Gly Arg Arg Leu Ala Arg Gly Ser Pro Pro Ser Val 
945 950 955 

gcc age tec teg get age cag eta tee get cea tct etc aag gca act 4911 
Ala Ser Ser Ser Ala Ser Gin Leu Ser Ala Pro Ser Leu Lys Ala Thr 
960 965 970 

tgc acc get aac cat gac tec cct gat get gag etc ata gag gee aac 4959 
Cys Thr Ala Asn His Asp Ser Pro Asp Ala Glu Leu He Glu Ala Asn 
975 980 985 990 

etc eta tgg agg cag gag atg ggc ggc aac ate acc agg gtt gag tea 5007 
Leu Leu Trp Arg Gin Glu Met Gly Gly Asn He Thr Arg Val Glu Ser 
995 1000 1005 

gaa aac aaa gtg gtg att ctg gac tee ttc gat ccg ctt gtg gcg gag 5055 
Glu Asn Lys Val Val He Leu Asp Ser Phe Asp Pro Leu Val Ala Glu 
1010 1015 1020 

gag gac gag egg gag ate tec gta ccc gca gaa ate ctg egg aag tct 5103 
Glu Asp Glu Arg Glu He Ser Val Pro Ala Glu He Leu Arg Lys Ser 
1025 1030 1035 

egg aga ttc gcc cag gcc ctg ccc gtt tgg gcg egg ccg gac tat aac 5151 
Arg Arg Phe Ala Gin Ala Leu Pro Val Trp Ala Arg Pro Asp Tyr Asn 
1040 1045 1050 
ccc ccg cta gtg gag acg tgg aaa aag ccc gac tac gaa cca cct gtg 5199
Pro Pro Leu Val Glu Thr Trp Lys Lys Pro Asp Tyr Glu Pro Pro Val 
1055 1060 1065 1070 

gtc cat ggc tgc ccg ctt cca cct cca aag tec cct cct gtg cct ccg 5247 
Val His Gly Cys Pro Leu Pro Pro Pro Lys Ser Pro Pro Val Pro Pro 
1075 1080 1085 

cct egg aag aag egg acg gtg gtc etc act gaa tea acc eta tct act 5295 
Pro Arg Lys Lys Arg Thr Val Val Leu Thr Glu Ser Thr Leu Ser Thr 
1090 1095 1100 

gcc ttg gcc gag etc gcc acc aga age ttt ggc age tec tea act tec 5343 
Ala Leu Ala Glu Leu Ala Thr Arg Ser Phe Gly Ser Ser Ser Thr Ser 
1105 1110 1115 

ggc att acg ggc gac aat acg aca aca tec tct gag ccc gcc cct tct 5391 
Gly lie Thr Gly Asp Asn Thr Thr Thr Ser Ser Glu Pro Ala Pro Ser 
1120 1125 1130 

ggc tgc ccc ccc gac tec gac get gag tec tat tec tec atg ccc ccc 5439 
Gly Cys Pro Pro Asp Ser Asp Ala Glu Ser Tyr Ser Ser Met Pro Pro 
1135 1140 1145 1150 

ctg gag ggg gag cct ggg gat ccg gat ctt age gac ggg tea tgg tea 5487 
Leu Glu Gly Glu Pro Gly Asp Pro Asp Leu Ser Asp Gly Ser Trp Ser 
1155 1160 1165 

acg gtc agt agt gag gcc aac gcg gag gat gtc gtg tgc tgc tea atg 5535 
Thr Val Ser Ser Glu Ala Asn Ala Glu Asp Val Val Cys Cys Ser Met 
1170 1175 1180 

tct tac tct tgg aca ggc gea etc gtc acc ccg tgc gcc gcg gaa gaa 5583 
Ser Tyr Ser Trp Thr Gly Ala Leu Val Thr Pro Cys Ala Ala Glu Glu 
1185 1190 1195 

cag aaa ctg ccc ate aat gea eta age aac teg ttg eta cgt cac cac 5631 
Gin Lys Leu Pro lie Asn Ala Leu Ser Asn Ser Leu Leu Arg His His 
1200 1205 1210 

aat ttg gtg tat tec acc acc tea egc agt get tgc caa agg cag aag 5679 
Asn Leu Val Tyr Ser Thr Thr Ser Arg Ser Ala Cys Gin Arg Gin Lys 
1215 1220 1225 1230 

aaa gtc aca ttt gac aga ctg caa gtt ctg gac age cat tac cag gac 5727 
Lys Val Thr Phe Asp Arg Leu Gin Val Leu Asp Ser His Tyr Gin Asp 
1235 1240 1245 

gta etc aag gag gtt aaa gea gcg gcg tea aaa gtg aag get aac ttg 5775 
Val Leu Lys Glu Val Lys Ala Ala Ala Ser Lys Val Lys Ala Asn Leu 
1250 1255 1260 

eta tec gta gag gaa get tgc age ctg acg ccc cca cac tea gee aaa 5823 
Leu Ser Val Glu Glu Ala Cys Ser Leu Thr Pro Pro His Ser Ala Lys 
1265 1270 1275 

tec aag ttt ggt tat ggg gea aaa gac gtc cgt tgc cat gcc aga aag 5871 
Ser Lys Phe Gly Tyr Gly Ala Lys Asp Val Arg Cys His Ala Arg Lys 
1280 1285 1290 
gcc gta acc cac atc aac tcc gtg tgg aaa gac ctt ctg gaa gac aat 5919
Ala Val Thr His He Asn Ser Val Trp Lys Asp Leu Leu Glu Asp Asn 
1295 1300 1305 1310 

gta aca cca ata gac act acc ate atg get aag aac gag gtt ttc tgc 5967 
Val Thr Pro He Asp Thr Thr He Met Ala Lys Asn Glu Val Phe Cys 
1315 1320 1325 

gtt cag cct gag aag ggg ggt cgt aag cca get cgt etc ate gtg ttc 6015 
Val Gin Pro Glu Lys Gly Gly Arg Lys Pro Ala Arg Leu He Val Phe 
1330 1335 1340 

ccc gat ctg ggc gtg cgc gtg tge gaa aag atg get ttg tac gac gtg 6063 
Pro Asp Leu Gly Val Arg Val Cys Glu Lys Met Ala Leu Tyr Asp Val 
1345 1350 1355 

gtt aca aag etc ccc ttg gcc gtg atg gga age tec tac gga ttc caa 6111 
Val Thr Lys Leu Pro Leu Ala Val Met Gly Ser Ser Tyr Gly Phe Gin 
1360 1365 1370 

tac tea cca gga eag egg gtt gaa ttc etc gtg caa gcg tgg aag tec 6159 
Tyr Ser Pro Gly Gin Arg Val Glu Phe Leu Val Gin Ala Trp Lys Ser 
1375 1380 1385 1390 

aag aaa acc cca atg ggg ttc teg tat gat acc cgc tgc ttt gac tec 6207 
Lys Lys Thr Pro Met Gly Phe Ser Tyr Asp Thr Arg Cys Phe Asp Ser 
1395 1400 1405 

aca gtc act gag age gac ate cgt acg gag gag gca ate tac caa tgt 6255 
Thr Val Thr Glu Ser Asp He Arg Thr Glu Glu Ala He Tyr Gin Cys 
1410 1415 1420 

tgt gac etc gac ccc caa gcc cgc gtg gcc ate aag tec etc acc gag 6303 
Cys Asp Leu Asp Pro Gin Ala Arg Val Ala He Lys Ser Leu Thr Glu 
1425 1430 1435 

agg ctt tat gtt ggg ggc cct ctt ace aat tea agg ggg gag aac tgc 6351 
Arg Leu Tyr Val Gly Gly Pro Leu Thr Asn Ser Arg Gly Glu Asn Cys 
1440 1445 1450 

ggc tat cgc agg tgc cgc gcg age ggc gta ctg aca act age tgt ggt 6399 
Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Leu Thr Thr Ser Cys Gly 
1455 1460 1465 1470 

aac acc etc act tgc tac ate aag gcc egg gca gcc tgt ega gcc gca 6447 
Asn Thr Leu Thr Cys Tyr He Lys Ala Arg Ala Ala Cys Arg Ala Ala 
1475 1480 1485 

ggg etc cag gac tgc acc atg etc gtg tgt ggc gac gac tta gtc gtt 6495 
Gly Leu Gin Asp Cys Thr Met Leu Val Cys Gly Asp Asp Leu Val Val 
1490 1495 1500 

ate tgt gaa age gcg ggg gtc eag gag gac gcg gcg age ctg aga gee 6543 
He Cys Glu Ser Ala Gly Val Gin Glu Asp Ala Ala Ser Leu Arg Ala 
1505 1510 1515 

ttc acg gag get atg acc agg tac tec gee ccc cct ggg gac ccc cca 6591 
Phe Thr Glu Ala Met Thr Arg Tyr Ser Ala Pro Pro Gly Asp Pro Pro 
1520 1525 1530 
caa cca gaa tac gac ttg gag ctc ata aca tca tgc tcc tcc aac gtg 6639
Gin Pro Glu Tyr Asp Leu Glu Leu lie Thr Ser Cys Ser Ser Asn Val 
1535 1540 1545 1550 

tea gtc gcc cac gac ggc get gga aag agg gtc tac tac etc acc cgt 6687 
Ser Val Ala His Asp Gly Ala Gly Lys Arg Val Tyr Tyr Leu Thr Arg 
1555 1560 1565 

gac ect aca acc ccc etc gcg aga get gcg tgg gag aca gea aga cac 6735 
Asp Pro Thr Thr Pro Leu Ala Arg Ala Ala Trp Glu Thr Ala Arg His 
1570 1575 1580 

act eea gtc aat tec tgg eta ggc aac ata ate atg ttt gee ccc aca 6783 
Thr Pro Val Asn Ser Trp Leu Gly Asn He He Met Phe Ala Pro Thr 
1585 1590 1595 

ctg tgg gcg agg atg ata ctg atg acc cat ttc ttt age gtc ctt ata 6831 
Leu Trp Ala Arg Met lie Leu Met Thr His Phe Phe Ser Val Leu He 
1600 1605 1610 

gcc agg gac cag ctt gaa cag gcc etc gat tgc gag ate tac ggg gcc 6879 
Ala Arg Asp Gin Leu Glu Gin Ala Leu Asp Cys Glu He Tyr Gly Ala 
1615 1620 1625 1630 

tgc tac tec ata gaa eea ctg gat eta ect cca ate att caa aga etc 6927 
Cys Tyr Ser He Glu Pro Leu Asp Leu Pro Pro He He Gin Arg Leu 
1635 1640 1645 

cat ggc etc age gea ttt tea etc cac agt tac tct cca ggt gaa ate 6975 
His Gly Leu Ser Ala Phe Ser Leu His Ser Tyr Ser Pro Gly Glu He 
1650 1655 1660 

aat agg gtg gcc gea tgc etc aga aaa ctt ggg gta ceg ccc ttg cga 7023 
Asn Arg Val Ala Ala Cys Leu Arg Lys Leu Gly Val Pro Pro Leu Arg 
1665 1670 1675 

get tgg aga cac egg gcc egg age gtc egc get agg ctt ctg gee aga 7071 
Ala Trp Arg His Arg Ala Arg Ser Val Arg Ala Arg Leu Leu Ala Arg 
1680 1685 1690 

gga ggc agg get gee ata tgt ggc aag tac etc ttc aac tgg gea gta 7119 
Gly Gly Arg Ala Ala He Cys Gly Lys Tyr Leu Phe Asn Trp Ala Val 
1695 1700 1705 1710 

aga aca aag etc aaa etc act cca ata gcg gcc get ggc cag ctg gac 7167 
Arg Thr Lys Leu Lys Leu Thr Pro He Ala Ala Ala Gly Gin Leu Asp 
1715 1720 1725 

ttg tec ggc tgg ttc acg get ggc tac age ggg gga gac att tat cac 7215 
Leu Ser Gly Trp Phe Thr Ala Gly Tyr Ser Gly Gly Asp He Tyr His 
1730 1735 1740 

age gtg tet eat gee egg ccc egc tgg ate tgg ttt tgc eta etc ctg 7263 
Ser Val Ser His Ala Arg Pro Arg Trp He Trp Phe Cys Leu Leu Leu 
1745 1750 1755 

ctt get gea ggg gta ggc ate tac etc etc ccc aac cga tgaaggttgg 7312 
Leu Ala Ala Gly Val Gly He Tyr Leu Leu Pro Asn Arg 
1760 1765 1770 
ggtaaacact ccggcctaaa aaaaaaaaaa aatctagaaa ggcgcgccaa gatatcaagg 7372
atccactacg cgttagagct cgctgatcag cctcgactgt gccttctagt tgccagccat 7432
ctgttgtttg cccctccccc gtgccttcct tgaccctgga aggtgccact cccactgtcc 7492
tttcctaata aaatgaggaa attgcatcgc attgtctgag taggtgtcat tctattctgg 7552
ggggtggggt ggggcaggac agcaaggggg aggattggga agacaatagc aggcatgctg 7612
gggagctctt ccgcttcctc gctcactgac tcgctgcgct cggtcgttcg gctgcggcga 7672
gcggtatcag ctcactcaaa ggcggtaata cggttatcca cagaatcagg ggataacgca 7732
ggaaagaaca tgtgagcaaa aggccagcaa aaggccagga accgtaaaaa ggccgcgttg 7792
ctggcgtttt tccataggct ccgcccccct gacgagcatc acaaaaatcg acgctcaagt 7852
cagaggtggc gaaacccgac aggactataa agataccagg cgtttccccc tggaagctcc 7912
ctcgtgcgct ctcctgttcc gaccctgccg cttaccggat acctgtccgc ctttctccct 7972
tcgggaagcg tggcgctttc tcaatgctca cgctgtaggt atctcagttc ggtgtaggtc 8032
gttcgctcca agctgggctg tgtgcacgaa ccccccgttc agcccgaccg ctgcgcctta 8092
tccggtaact atcgtcttga gtccaacccg gtaagacacg acttatcgcc actggcagca 8152
gccactggta acaggattag cagagcgagg tatgtaggcg gtgctacaga gttcttgaag 8212
tggtggccta actacggcta cactagaagg acagtatttg gtatctgcgc tctgctgaag 8272
ccagttacct tcggaaaaag agttggtagc tcttgatccg gcaaacaaac caccgctggt 8332
agcggtggtt tttttgtttg caagcagcag attacgcgca gaaaaaaagg atctcaagaa 8392
gatcctttga tcttttctac ggggtctgac gctcagtgga acgaaaactc acgttaaggg 8452
attttggtca tgagattatc aaaaaggatc ttcacctaga tccttttaaa ttaaaaatga 8512
agttttaaat caatctaaag tatatatgag taaacttggt ctgacagtta ccaatgctta 8572
atcagtgagg cacctatctc agcgatctgt ctatttcgtt catccatagt tgcctgactc 8632
cccgtcgtgt agataactac gatacgggag ggcttaccat ctggccccag tgctgcaatg 8692
ataccgcgag acccacgctc accggctcca gatttatcag caataaacca gccagccgga 8752
agggccgagc gcagaagtgg tcctgcaact ttatccgcct ccatccagtc tattaattgt 8812
tgccgggaag ctagagtaag tagttcgcca gttaatagtt tgcgcaacgt tgttgccatt 8872
gctacaggca tcgtggtgtc acgctcgtcg tttggtatgg cttcattcag ctccggttcc 8932
caacgatcaa ggcgagttac atgatccccc atgttgtgca aaaaagcggt tagctccttc 8992
ggtcctccga tcgttgtcag aagtaagttg gccgcagtgt tatcactcat ggttatggca 9052
gcactgcata attctcttac tgtcatgcca tccgtaagat gcttttctgt gactggtgag 9112
tactcaacca agtcattctg agaatagtgt atgcggcgac cgagttgctc ttgcccggcg 9172
tcaatacggg ataataccgc gccacatagc agaactttaa aagtgctcat cattggaaaa 9232 
cgttcttcgg ggcgaaaact ctcaaggatc ttaccgctgt tgagatccag ttcgatgtaa 9292 
cccactcgtg cacccaactg atcttcagca tcttttactt tcaccagcgt ttctgggtga 9352 
gcaaaaacag gaaggcaaaa tgccgcaaaa aagggaataa gggcgacacg gaaatgttga 9412 
atactcatac tcttcctttt tcaatattat tgaagcattt atcagggtta ttgtctcatg 9472 
agcggataca tatttgaatg tatttagaaa aataaacaaa taggggttcc gcgcacattt 9532 
ccccgaaaag tgccacctga cgtctaagaa accattatta tcatgacatt aacctataaa 9592 
aataggcgta tcacgaggcc ctttcgtc 9620 



<210> 2 
<211> 1771 
<212> PRT 

<213> Hepatitis C virus 
<220> 

<223> Description of Artificial Sequence: Hepatitis C pns345 
<400> 2 

Met Ala Ala Tyr Ala Ala Gln Gly Tyr Lys Val Leu Val Leu Asn Pro
1 5 10 15

Ser Val Ala Ala Thr Leu Gly Phe Gly Ala Tyr Met Ser Lys Ala His 

20 25 30 

Gly lie Asp Pro Asn lie Arg Thr Gly Val Arg Thr lie Thr Thr Gly 
35 40 45 

Ser Pro lie Thr Tyr Ser Thr Tyr Gly Lys Phe Leu Ala Asp Gly Gly 
50 55 60 

Cys Ser Gly Gly Ala Tyr Asp He He He Cys Asp Glu Cys His Ser 
65 70 75 80 

Thr Asp Ala Thr Ser He Leu Gly He Gly Thr Val Leu Asp Gin Ala 
85 90 95 

Glu Thr Ala Gly Ala Arg Leu Val Val Leu Ala Thr Ala Thr Pro Pro 
100 105 110 

Gly Ser Val Thr Val Pro His Pro Asn lie Glu Glu Val Ala Leu Ser 
115 120 125 

Thr Thr Gly Glu He Pro Phe Tyr Gly Lys Ala He Pro Leu Glu Val 
130 135 140 

He Lys Gly Gly Arg His Leu He Phe Cys His Ser Lys Lys Lys Cys 
145 150 155 160 

Asp Glu Leu Ala Ala Lys Leu Val Ala Leu Gly Ile Asn Ala Val Ala
165 170 175

Tyr Tyr Arg Gly Leu Asp Val Ser Val lie Pro Thr Ser Gly Asp Val 
180 185 190 

Val Val Val Ala Thr Asp Ala Leu Met Thr Gly Tyr Thr Gly Asp Phe 
195 200 205

Asp Ser Val lie Asp Cys Asn Thr Cys Val Thr Gin Thr Val Asp Phe 
210 215 220 

Ser Leu Asp Pro Thr Phe Thr He Glu Thr He Thr Leu Pro Gin Asp 
225 230 235 240 

Ala Val Ser Arg Thr Gin Arg Arg Gly Arg Thr Gly Arg Gly Lys Pro 
245 250 255 

Gly He Tyr Arg Phe Val Ala Pro Gly Glu Arg Pro Ser Gly Met Phe 

260 265 270 

Asp Ser Ser Val Leu Cys Glu Cys Tyr Asp Ala Gly Cys Ala Trp Tyr 
275 280 285 

Glu Leu Thr Pro Ala Glu Thr Thr Val Arg Leu Arg Ala Tyr Met Asn 
290 295 300

Thr Pro Gly Leu Pro Val Cys Gin Asp His Leu Glu Phe Trp Glu Gly 
305 310 315 320 

Val Phe Thr Gly Leu Thr His He Asp Ala His Phe Leu Ser Gin Thr 
325 330 335 

Lys Gin Ser Gly Glu Asn Leu Pro Tyr Leu Val Ala Tyr Gin Ala Thr 
340 345 350 

Val Cys Ala Arg Ala Gin Ala Pro Pro Pro Ser Trp Asp Gin Met Trp 
355 360 365 

Lys Cys Leu He Arg Leu Lys Pro Thr Leu His Gly Pro Thr Pro Leu 
370 375 380 

Leu Tyr Arg Leu Gly Ala Val Gin Asn Glu He Thr Leu Thr His Pro 
385 390 395 400 

Val Thr Lys Tyr He Met Thr Cys Met Ser Ala Asp Leu Glu Val Val 
405 410 415 

Thr Ser Thr Trp Val Leu Val Gly Gly Val Leu Ala Ala Leu Ala Ala 
420 425 430 

Tyr Cys Leu Ser Thr Gly Cys Val Val He Val Gly Arg Val Val Leu 

435 440 445 

Ser Gly Lys Pro Ala He He Pro Asp Arg Glu Val Leu Tyr Arg Glu 
450 455 460 



Phe Asp Glu Met Glu Glu Cys Ser Gin His Leu Pro Tyr He Glu Gin 
465 470 475 480 
Gly Met Met Leu Ala Glu Gln Phe Lys Gln Lys Ala Leu Gly Leu Leu
485 490 495 

Gin Thr Ala Ser Arg Gin Ala Glu Val He Ala Pro Ala Val Gin Thr 
500 505 510 

Asn Trp Gin Lys Leu Glu Thr Phe Trp Ala Lys His Met Trp Asn Phe 
515 520 525 

He Ser Gly He Gin Tyr Leu Ala Gly Leu Ser Thr Leu Pro Gly Asn 
530 535 540 

Pro Ala He Ala Ser Leu Met Ala Phe Thr Ala Ala Val Thr Ser Pro 

545 550 555 560 

Leu Thr Thr Ser Gin Thr Leu Leu Phe Asn He Leu Gly Gly Trp Val 
565 570 575 

Ala Ala Gin Leu Ala Ala Pro Gly Ala Ala Thr Ala Phe Val Gly Ala 
580 585 590 

Gly Leu Ala Gly Ala Ala He Gly Ser Val Gly Leu Gly Lys Val Leu 
595 600 605 

He Asp He Leu Ala Gly Tyr Gly Ala Gly Val Ala Gly Ala Leu Val 
610 615 620 



Ala Phe Lys He Met Ser Gly Glu Val Pro Ser Thr Glu Asp Leu Val 
625 630 635 640 

Asn Leu Leu Pro Ala He Leu Ser Pro Gly Ala Leu Val Val Gly Val 
645 650 655 

Val Cys Ala Ala He Leu Arg Arg His Val Gly Pro Gly Glu Gly Ala 
660 665 670 

Val Gin Trp Met Asn Arg Leu He Ala Phe Ala Ser Arg Gly Asn His 
675 680 685 

Val Ser Pro Thr His Tyr Val Pro Glu Ser Asp Ala Ala Ala Arg Val 
690 695 700 

Thr Ala He Leu Ser Ser Leu Thr Val Thr Gin Leu Leu Arg Arg Leu 
705 710 715 720 

His Gin Trp He Ser Ser Glu Cys Thr Thr Pro Cys Ser Gly Ser Trp 
725 730 735 

Leu Arg Asp He Trp Asp Trp He Cys Glu Val Leu Ser Asp Phe Lys 
740 745 750 

Thr Trp Leu Lys Ala Lys Leu Met Pro Gin Leu Pro Gly He Pro Phe 
755 760 765 

Val Ser Cys Gin Arg Gly Tyr Lys Gly Val Trp Arg Gly Asp Gly He 
770 775 780 

Met His Thr Arg Cys His Cys Gly Ala Glu He Thr Gly His Val Lys 
785 790 795 800 
Asn Gly Thr Met Arg Ile Val Gly Pro Arg Thr Cys Arg Asn Met Trp
805 810 815 

Ser Gly Thr Phe Pro He Asn Ala Tyr Thr Thr Gly Pro Cys Thr Pro 
820 825 830 

Leu Pro Ala Pro Asn Tyr Thr Phe Ala Leu Trp Arg Val Ser Ala Glu 
835 840 845 

Glu Tyr Val Glu lie Arg Gin Val Gly Asp Phe His Tyr Val Thr Gly 
850 855 860 

Met Thr Thr Asp Asn Leu Lys Cys Pro Cys Gin Val Pro Ser Pro Glu 
865 870 875 880 

Phe Phe Thr Glu Leu Asp Gly Val Arg Leu His Arg Phe Ala Pro Pro 
885 890 895 

Cys Lys Pro Leu Leu Arg Glu Glu Val Ser Phe Arg Val Gly Leu His 
900 905 910 

Glu Tyr Pro Val Gly Ser Gin Leu Pro Cys Glu Pro Glu Pro Asp Val 
915 920 925 

Ala Val Leu Thr Ser Met Leu Thr Asp Pro Ser His He Thr Ala Glu 
930 935 940 

Ala Ala Gly Arg Arg Leu Ala Arg Gly Ser Pro Pro Ser Val Ala Ser 
945 950 955 960 

Ser Ser Ala Ser Gin Leu Ser Ala Pro Ser Leu Lys Ala Thr Cys Thr 
965 970 975 

Ala Asn His Asp Ser Pro Asp Ala Glu Leu He Glu Ala Asn Leu Leu 
980 985 990 

Trp Arg Gin Glu Met Gly Gly Asn lie Thr Arg Val Glu Ser Glu Asn 
995 1000 1005 



Lys Val Val Ile Leu Asp Ser Phe Asp Pro Leu Val Ala Glu Glu Asp
1010 1015 1020

Glu Arg Glu Ile Ser Val Pro Ala Glu Ile Leu Arg Lys Ser Arg Arg
1025 1030 1035 1040

Phe Ala Gln Ala Leu Pro Val Trp Ala Arg Pro Asp Tyr Asn Pro Pro
1045 1050 1055

Leu Val Glu Thr Trp Lys Lys Pro Asp Tyr Glu Pro Pro Val Val His
1060 1065 1070

Gly Cys Pro Leu Pro Pro Pro Lys Ser Pro Pro Val Pro Pro Pro Arg
1075 1080 1085

Lys Lys Arg Thr Val Val Leu Thr Glu Ser Thr Leu Ser Thr Ala Leu
1090 1095 1100

Ala Glu Leu Ala Thr Arg Ser Phe Gly Ser Ser Ser Thr Ser Gly Ile
1105 1110 1115 1120

Thr Gly Asp Asn Thr Thr Thr Ser Ser Glu Pro Ala Pro Ser Gly Cys
1125 1130 1135 

Pro Pro Asp Ser Asp Ala Glu Ser Tyr Ser Ser Met Pro Pro Leu Glu 
1140 1145 1150 

Gly Glu Pro Gly Asp Pro Asp Leu Ser Asp Gly Ser Trp Ser Thr Val 
1155 1160 1165 



Ser Ser Glu Ala Asn Ala Glu Asp Val Val Cys Cys Ser Met Ser Tyr
1170 1175 1180

Ser Trp Thr Gly Ala Leu Val Thr Pro Cys Ala Ala Glu Glu Gln Lys
1185 1190 1195 1200

Leu Pro Ile Asn Ala Leu Ser Asn Ser Leu Leu Arg His His Asn Leu
1205 1210 1215

Val Tyr Ser Thr Thr Ser Arg Ser Ala Cys Gln Arg Gln Lys Lys Val
1220 1225 1230

Thr Phe Asp Arg Leu Gln Val Leu Asp Ser His Tyr Gln Asp Val Leu
1235 1240 1245

Lys Glu Val Lys Ala Ala Ala Ser Lys Val Lys Ala Asn Leu Leu Ser
1250 1255 1260



Val Glu Glu Ala Cys Ser Leu Thr Pro Pro His Ser Ala Lys Ser Lys 
1265 1270 1275 1280

Phe Gly Tyr Gly Ala Lys Asp Val Arg Cys His Ala Arg Lys Ala Val 
1285 1290 1295 

Thr His lie Asn Ser Val Trp Lys Asp Leu Leu Glu Asp Asn Val Thr 
1300 1305 1310 

Pro He Asp Thr Thr He Met Ala Lys Asn Glu Val Phe Cys Val Gin 
1315 1320 1325 

Pro Glu Lys Gly Gly Arg Lys Pro Ala Arg Leu He Val Phe Pro Asp 
1330 1335 1340 

Leu Gly Val Arg Val Cys Glu Lys Met Ala Leu Tyr Asp Val Val Thr 
1345 1350 1355 1360

Lys Leu Pro Leu Ala Val Met Gly Ser Ser Tyr Gly Phe Gin Tyr Ser 
1365 1370 1375 

Pro Gly Gin Arg Val Glu Phe Leu Val Gin Ala Trp Lys Ser Lys Lys 
1380 1385 1390 

Thr Pro Met Gly Phe Ser Tyr Asp Thr Arg Cys Phe Asp Ser Thr Val
1395 1400 1405 

Thr Glu Ser Asp He Arg Thr Glu Glu Ala He Tyr Gin Cys Cys Asp 
1410 1415 1420 

Leu Asp Pro Gin Ala Arg Val Ala He Lys Ser Leu Thr Glu Arg Leu 
1425 1430 1435 1440
Tyr Val Gly Gly Pro Leu Thr Asn Ser Arg Gly Glu Asn Cys Gly Tyr
1445 1450 1455 

Arg Arg Cys Arg Ala Ser Gly Val Leu Thr Thr Ser Cys Gly Asn Thr 
1460 1465 1470 

Leu Thr Cys Tyr lie Lys Ala Arg Ala Ala Cys Arg Ala Ala Gly Leu 
1475 1480 1485 

Gin Asp Cys Thr Met Leu Val Cys Gly Asp Asp Leu Val Val He Cys 
1490 1495 1500 

Glu Ser Ala Gly Val Gin Glu Asp Ala Ala Ser Leu Arg Ala Phe Thr 
1505 1510 1515 1520

Glu Ala Met Thr Arg Tyr Ser Ala Pro Pro Gly Asp Pro Pro Gin Pro 
1525 1530 1535 

Glu Tyr Asp Leu Glu Leu He Thr Ser Cys Ser Ser Asn Val Ser Val 
1540 1545 1550 

Ala His Asp Gly Ala Gly Lys Arg Val Tyr Tyr Leu Thr Arg Asp Pro 
1555 1560 1565 

Thr Thr Pro Leu Ala Arg Ala Ala Trp Glu Thr Ala Arg His Thr Pro 

1570 1575 1580 

Val Asn Ser Trp Leu Gly Asn He He Met Phe Ala Pro Thr Leu Trp 
1585 1590 1595 1600

Ala Arg Met He Leu Met Thr His Phe Phe Ser Val Leu He Ala Arg 
1605 1610 1615 

Asp Gin Leu Glu Gin Ala Leu Asp Cys Glu He Tyr Gly Ala Cys Tyr 
1620 1625 1630 

Ser He Glu Pro Leu Asp Leu Pro Pro He He Gin Arg Leu His Gly 
1635 1640 1645 

Leu Ser Ala Phe Ser Leu His Ser Tyr Ser Pro Gly Glu He Asn Arg 
1650 1655 1660 

Val Ala Ala Cys Leu Arg Lys Leu Gly Val Pro Pro Leu Arg Ala Trp 
1665 1670 1675 1680

Arg His Arg Ala Arg Ser Val Arg Ala Arg Leu Leu Ala Arg Gly Gly 
1685 1690 1695 

Arg Ala Ala He Cys Gly Lys Tyr Leu Phe Asn Trp Ala Val Arg Thr 
1700 1705 1710 

Lys Leu Lys Leu Thr Pro He Ala Ala Ala Gly Gin Leu Asp Leu Ser 
1715 1720 1725 

Gly Trp Phe Thr Ala Gly Tyr Ser Gly Gly Asp He Tyr His Ser Val 
1730 1735 1740 

Ser His Ala Arg Pro Arg Trp He Trp Phe Cys Leu Leu Leu Leu Ala 
1745 1750 1755 1760
Ala Gly Val Gly Ile Tyr Leu Leu Pro Asn Arg
1765 1770 



<210> 3 
<211> 9620 
<212> DNA 

<213> Artificial Sequence 

<220> 
<221> CDS 

<222> (1990)..(7302)
<220> 

<223> Description of Artificial Sequence: pDeltaNS3NS5
<400> 3 



cgcgcgtttc ggtgatgacg gtgaaaacct ctgacacatg cagctcccgg agacggtcac 60

agcttgtctg taagcggatg ccgggagcag acaagcccgt cagggcgcgt cagcgggtgt 120

tggcgggtgt cggggctggc ttaactatgc ggcatcagag cagattgtac tgagagtgca 180

ccatatgaag ctttttgcaa aagcctaggc ctccaaaaaa gcctcctcac tacttctgga 240

atagctcaga ggccgaggcg gcctcggcct ctgcataaat aaaaaaaatt agtcagccat 300

ggggcggaga atgggcggaa ctgggcgggg agggaattat tggctattgg ccattgcata 360

cgttgtatct atatcataat atgtacattt atattggctc atgtccaata tgaccgccat 420

gttgacattg attattgact agttattaat agtaatcaat tacggggtca ttagttcata 480

gcccatatat ggagttccgc gttacataac ttacggtaaa tggcccgcct ggctgaccgc 540

ccaacgaccc ccgcccattg acgtcaataa tgacgtatgt tcccatagta acgccaatag 600

ggactttcca ttgacgtcaa tgggtggagt atttacggta aactgcccac ttggcagtac 660

atcaagtgta tcatatgcca agtccgcccc ctattgacgt caatgacggt aaatggcccg 720

cctggcatta tgcccagtac atgaccttac gggactttcc tacttggcag tacatctacg 780

tattagtcat cgctattacc atggtgatgc ggttttggca gtacaccaat gggcgtggat 840

agcggtttga ctcacgggga tttccaagtc tccaccccat tgacgtcaat gggagtttgt 900

tttggcacca aaatcaacgg gactttccaa aatgtcgtaa taaccccgcc ccgttgacgc 960

aaatgggcgg taggcgtgta cggtgggagg tctatataag cagagctcgt ttagtgaacc 1020

gtcagatcgc ctggagacgc catccacgct gttttgacct ccatagaaga caccgggacc 1080

gatccagcct ccgcggccgg gaacggtgca ttggaacgcg gattccccgt gccaagagtg 1140

acgtaagtac cgcctataga ctctataggc acaccccttt ggctcttatg catgctatac 1200

tgtttttggc ttggggccta tacacccccg ctccttatgc tataggtgat ggtatagctt 1260

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agcctatagg tgtgggttat tgaccattat tgaccactcc cctattggtg acgatacttt 1320 

ccattactaa tccataacat ggctctttgc cacaactatc tctattggct atatgccaat 1380 

actctgtcct tcagagactg acacggactc tgtattttta caggatgggg tccatttatt 1440 

atttacaaat tcacatatac aacaacgccg tcccccgtgc ccgcagtttt tattaaacat 1500 

agcgtgggat ctccgacatc tcgggtacgt gttccggaca tgggctcttc tccggtagcg 1560 

gcggagcttc cacatccgag ccctggtccc atccgtccag cggctcatgg tcgctcggca 1620 

gctccttgct cctaacagtg gaggccagac ttaggcacag cacaatgccc accaccacca 1680 

gtgtgccgca caaggccgtg gcggtagggt atgtgtctga aaatgagctc ggagattggg 1740 

ctcgcacctg gacgcagatg gaagacttaa ggcagcggca gaagaagatg caggcagctg 1800 

agttgttgta ttctgataag agtcagaggt aactcccgtt gcggtgctgt taacggtgga 1860 

gggcagtgta gtctgagcag tactcgttgc tgccgcgcgc gccaccagac ataatagctg 1920 

acagactaac agactgttcc tttccatggg tcttttctgc agtcaccgtc gtcgacctaa 1980 

gaattcacc atg gct gca tat gca gct cag ggc tat aag gtg cta gta ctc 2031
Met Ala Ala Tyr Ala Ala Gln Gly Tyr Lys Val Leu Val Leu
1 5 10

aac ccc tct gtt gct gca aca ctg ggc ttt ggt gct tac atg tcc aag 2079
Asn Pro Ser Val Ala Ala Thr Leu Gly Phe Gly Ala Tyr Met Ser Lys
15 20 25 30

gct cat ggg atc gat cct aac atc agg acc ggg gtg aga aca att acc 2127
Ala His Gly Ile Asp Pro Asn Ile Arg Thr Gly Val Arg Thr Ile Thr
35 40 45

act ggc agc ccc atc acg tac tcc acc tac ggc aag ttc ctt gcc gac 2175
Thr Gly Ser Pro Ile Thr Tyr Ser Thr Tyr Gly Lys Phe Leu Ala Asp
50 55 60

ggc ggg tgc tcg ggg ggc gct tat gac ata ata att tgt gac gag tgc 2223
Gly Gly Cys Ser Gly Gly Ala Tyr Asp Ile Ile Ile Cys Asp Glu Cys
65 70 75

cac tcc acg gat gcc aca tcc atc ttg ggc att ggc act gtc ctt gac 2271
His Ser Thr Asp Ala Thr Ser Ile Leu Gly Ile Gly Thr Val Leu Asp
80 85 90

caa gca gag act gcg ggg gcg aga ctg gtt gtg ctc gcc acc gcc acc 2319
Gln Ala Glu Thr Ala Gly Ala Arg Leu Val Val Leu Ala Thr Ala Thr
95 100 105 110

cct ccg ggc tcc gtc act gtg ccc cat ccc aac atc gag gag gtt gct 2367
Pro Pro Gly Ser Val Thr Val Pro His Pro Asn Ile Glu Glu Val Ala
115 120 125

ctg tcc acc acc gga gag atc cct ttt tac ggc aag gct atc ccc ctc 2415
Leu Ser Thr Thr Gly Glu Ile Pro Phe Tyr Gly Lys Ala Ile Pro Leu
130 135 140

gaa gta atc aag ggg ggg aga cat ctc atc ttc tgt cat tca aag aag 2463
Glu Val Ile Lys Gly Gly Arg His Leu Ile Phe Cys His Ser Lys Lys
145 150 155

aag tgc gac gaa ctc gcc gca aag ctg gtc gca ttg ggc atc aat gcc 2511
Lys Cys Asp Glu Leu Ala Ala Lys Leu Val Ala Leu Gly Ile Asn Ala
160 165 170

gtg gcc tac tac cgc ggt ctt gac gtg tcc gtc atc ccg acc agc ggc 2559
Val Ala Tyr Tyr Arg Gly Leu Asp Val Ser Val Ile Pro Thr Ser Gly
175 180 185 190

gat gtt gtc gtc gtg gca acc gat gcc ctc atg acc ggc tat acc ggc 2607
Asp Val Val Val Val Ala Thr Asp Ala Leu Met Thr Gly Tyr Thr Gly
195 200 205

gac ttc gac tcg gtg ata gac tgc aat acg tgt gtc acc cag aca gtc 2655
Asp Phe Asp Ser Val Ile Asp Cys Asn Thr Cys Val Thr Gln Thr Val
210 215 220

gat ttc agc ctt gac cct acc ttc acc att gag aca atc acg ctc ccc 2703
Asp Phe Ser Leu Asp Pro Thr Phe Thr Ile Glu Thr Ile Thr Leu Pro
225 230 235

caa gat gct gtc tcc cgc act caa cgt cgg ggc agg act ggc agg ggg 2751
Gln Asp Ala Val Ser Arg Thr Gln Arg Arg Gly Arg Thr Gly Arg Gly
240 245 250

aag cca ggc atc tac aga ttt gtg gca ccg ggg gag cgc ccc tcc ggc 2799
Lys Pro Gly Ile Tyr Arg Phe Val Ala Pro Gly Glu Arg Pro Ser Gly
255 260 265 270

atg ttc gac tcg tcc gtc ctc tgt gag tgc tat gac gca ggc tgt gct 2847
Met Phe Asp Ser Ser Val Leu Cys Glu Cys Tyr Asp Ala Gly Cys Ala
275 280 285

tgg tat gag ctc acg ccc gcc gag act aca gtt agg cta cga gcg tac 2895
Trp Tyr Glu Leu Thr Pro Ala Glu Thr Thr Val Arg Leu Arg Ala Tyr
290 295 300

atg aac acc ccg ggg ctt ccc gtg tgc cag gac cat ctt gaa ttt tgg 2943
Met Asn Thr Pro Gly Leu Pro Val Cys Gln Asp His Leu Glu Phe Trp
305 310 315

gag ggc gtc ttt aca ggc ctc act cat ata gat gcc cac ttt cta tcc 2991
Glu Gly Val Phe Thr Gly Leu Thr His Ile Asp Ala His Phe Leu Ser
320 325 330

cag aca aag cag agt ggg gag aac ctt cct tac ctg gta gcg tac caa 3039
Gln Thr Lys Gln Ser Gly Glu Asn Leu Pro Tyr Leu Val Ala Tyr Gln
335 340 345 350

gcc acc gtg tgc gct agg gct caa gcc cct ccc cca tcg tgg gac cag 3087
Ala Thr Val Cys Ala Arg Ala Gln Ala Pro Pro Pro Ser Trp Asp Gln
355 360 365

atg tgg aag tgt ttg att cgc ctc aag ccc acc ctc cat ggg cca aca 3135
Met Trp Lys Cys Leu Ile Arg Leu Lys Pro Thr Leu His Gly Pro Thr
370 375 380

ccc ctg cta tac aga ctg ggc gct gtt cag aat gaa atc acc ctg acg 3183
Pro Leu Leu Tyr Arg Leu Gly Ala Val Gln Asn Glu Ile Thr Leu Thr
385 390 395

cac cca gtc acc aaa tac atc atg aca tgc atg tcg gcc gac ctg gag 3231
His Pro Val Thr Lys Tyr Ile Met Thr Cys Met Ser Ala Asp Leu Glu
400 405 410

gtc gtc acg agc acc tgg gtg ctc gtt ggc ggc gtc ctg gct gct ttg 3279
Val Val Thr Ser Thr Trp Val Leu Val Gly Gly Val Leu Ala Ala Leu
415 420 425 430

gcc gcg tat tgc ctg tca aca ggc tgc gtg gtc ata gtg ggc agg gtc 3327
Ala Ala Tyr Cys Leu Ser Thr Gly Cys Val Val Ile Val Gly Arg Val
435 440 445

gtc ttg tcc ggg aag ccg gca atc ata cct gac agg gaa gtc ctc tac 3375
Val Leu Ser Gly Lys Pro Ala Ile Ile Pro Asp Arg Glu Val Leu Tyr
450 455 460

cga gag ttc gat gag atg gaa gag tgc tct cag cac tta ccg tac atc 3423
Arg Glu Phe Asp Glu Met Glu Glu Cys Ser Gln His Leu Pro Tyr Ile
465 470 475

gag caa ggg atg atg ctc gcc gag cag ttc aag cag aag gcc ctc ggc 3471
Glu Gln Gly Met Met Leu Ala Glu Gln Phe Lys Gln Lys Ala Leu Gly
480 485 490

ctc ctg cag acc gcg tcc cgt cag gca gag gtt atc gcc cct gct gtc 3519
Leu Leu Gln Thr Ala Ser Arg Gln Ala Glu Val Ile Ala Pro Ala Val
495 500 505 510

cag acc aac tgg caa aaa ctc gag acc ttc tgg gcg aag cat atg tgg 3567
Gln Thr Asn Trp Gln Lys Leu Glu Thr Phe Trp Ala Lys His Met Trp
515 520 525

aac ttc atc agt ggg ata caa tac ttg gcg ggc ttg tca acg ctg cct 3615
Asn Phe Ile Ser Gly Ile Gln Tyr Leu Ala Gly Leu Ser Thr Leu Pro
530 535 540

ggt aac ccc gcc att gct tca ttg atg gct ttt aca gct gct gtc acc 3663
Gly Asn Pro Ala Ile Ala Ser Leu Met Ala Phe Thr Ala Ala Val Thr
545 550 555

agc cca cta acc act agc caa acc ctc ctc ttc aac ata ttg ggg ggg 3711
Ser Pro Leu Thr Thr Ser Gln Thr Leu Leu Phe Asn Ile Leu Gly Gly
560 565 570

tgg gtg gct gcc cag ctc gcc gcc ccc ggt gcc gct act gcc ttt gtg 3759
Trp Val Ala Ala Gln Leu Ala Ala Pro Gly Ala Ala Thr Ala Phe Val
575 580 585 590

ggc gct ggc tta gct ggc gcc gcc atc ggc agt gtt gga ctg ggg aag 3807
Gly Ala Gly Leu Ala Gly Ala Ala Ile Gly Ser Val Gly Leu Gly Lys
595 600 605

gtc ctc ata gac atc ctt gca ggg tat ggc gcg ggc gtg gcg gga gct 3855
Val Leu Ile Asp Ile Leu Ala Gly Tyr Gly Ala Gly Val Ala Gly Ala
610 615 620

ctt gtg gca ttc aag atc atg agc ggt gag gtc ccc tcc acg gag gac 3903
Leu Val Ala Phe Lys Ile Met Ser Gly Glu Val Pro Ser Thr Glu Asp
625 630 635

ctg gtc aat cta ctg ccc gcc atc ctc tcg ccc gga gcc ctc gta gtc 3951
Leu Val Asn Leu Leu Pro Ala Ile Leu Ser Pro Gly Ala Leu Val Val
640 645 650

ggc gtg gtc tgt gca gca ata ctg cgc cgg cac gtt ggc ccg ggc gag 3999
Gly Val Val Cys Ala Ala Ile Leu Arg Arg His Val Gly Pro Gly Glu
655 660 665 670

999 9tg cag tgg atg aac egg ctg ata gcc ttc gcc tec egg ggg 4047 

Gly Ala Val Gln Trp Met Asn Arg Leu Ile Ala Phe Ala Ser Arg Gly
675 680 685

aac cat gtt tcc ccc acg cac tac gtg ccg gag agc gat gca gct gcc 4095
Asn His Val Ser Pro Thr His Tyr Val Pro Glu Ser Asp Ala Ala Ala
690 695 700

cgc gtc act gcc ata ctc agc agc ctc act gta acc cag ctc ctg agg 4143
Arg Val Thr Ala Ile Leu Ser Ser Leu Thr Val Thr Gln Leu Leu Arg
705 710 715

cga ctg cac cag tgg ata agc tcg gag tgt acc act cca tgc tcc ggt 4191
Arg Leu His Gln Trp Ile Ser Ser Glu Cys Thr Thr Pro Cys Ser Gly
720 725 730

tcc tgg cta agg gac atc tgg gac tgg ata tgc gag gtg ttg agc gac 4239
Ser Trp Leu Arg Asp Ile Trp Asp Trp Ile Cys Glu Val Leu Ser Asp
735 740 745 750

ttt aag acc tgg cta aaa gct aag ctc atg cca cag ctg cct ggg atc 4287
Phe Lys Thr Trp Leu Lys Ala Lys Leu Met Pro Gln Leu Pro Gly Ile
755 760 765

ccc ttt gtg tcc tgc cag cgc ggg tat aag ggg gtc tgg cga ggg gac 4335
Pro Phe Val Ser Cys Gln Arg Gly Tyr Lys Gly Val Trp Arg Gly Asp
770 775 780

ggc atc atg cac act cgc tgc cac tgt gga gct gag atc act gga cat 4383
Gly Ile Met His Thr Arg Cys His Cys Gly Ala Glu Ile Thr Gly His
785 790 795

gtc aaa aac ggg acg atg agg atc gtc ggt cct agg acc tgc agg aac 4431
Val Lys Asn Gly Thr Met Arg Ile Val Gly Pro Arg Thr Cys Arg Asn
800 805 810

atg tgg agt ggg acc ttc ccc att aat gcc tac acc acg ggc ccc tgt 4479
Met Trp Ser Gly Thr Phe Pro Ile Asn Ala Tyr Thr Thr Gly Pro Cys
815 820 825 830

acc ccc ctt cct gcg ccg aac tac acg ttc gcg cta tgg agg gtg tct 4527
Thr Pro Leu Pro Ala Pro Asn Tyr Thr Phe Ala Leu Trp Arg Val Ser
835 840 845

gca gag gaa tac gtg gag ata agg cag gtg ggg gac ttc cac tac gtg 4575
Ala Glu Glu Tyr Val Glu Ile Arg Gln Val Gly Asp Phe His Tyr Val
850 855 860

acg ggt atg act act gac aat ctt aaa tgc ccg tgc cag gtc cca tcg 4623
Thr Gly Met Thr Thr Asp Asn Leu Lys Cys Pro Cys Gln Val Pro Ser
865 870 875

ccc gaa ttt ttc aca gaa ttg gac ggg gtg cgc cta cat agg ttt gcg 4671
Pro Glu Phe Phe Thr Glu Leu Asp Gly Val Arg Leu His Arg Phe Ala
880 885 890

ccc ccc tgc aag ccc ttg ctg cgg gag gag gta tca ttc aga gta gga 4719
Pro Pro Cys Lys Pro Leu Leu Arg Glu Glu Val Ser Phe Arg Val Gly
895 900 905 910

ctc cac gaa tac ccg gta ggg tcg caa tta cct tgc gag ccc gaa ccg 4767
Leu His Glu Tyr Pro Val Gly Ser Gln Leu Pro Cys Glu Pro Glu Pro
915 920 925

gac gtg gcc gtg ttg acg tcc atg ctc act gat ccc tcc cat ata aca 4815
Asp Val Ala Val Leu Thr Ser Met Leu Thr Asp Pro Ser His Ile Thr
930 935 940

gca gag gcg gcc ggg cga agg ttg gcg agg gga tca ccc ccc tct gtg 4863
Ala Glu Ala Ala Gly Arg Arg Leu Ala Arg Gly Ser Pro Pro Ser Val
945 950 955

gcc agc tcc tcg gct agc cag cta tcc gct cca tct ctc aag gca act 4911
Ala Ser Ser Ser Ala Ser Gln Leu Ser Ala Pro Ser Leu Lys Ala Thr
960 965 970

tgc acc gct aac cat gac tcc cct gat gct gag ctc ata gag gcc aac 4959
Cys Thr Ala Asn His Asp Ser Pro Asp Ala Glu Leu Ile Glu Ala Asn
975 980 985 990

ctc cta tgg agg cag gag atg ggc ggc aac atc acc agg gtt gag tca 5007
Leu Leu Trp Arg Gln Glu Met Gly Gly Asn Ile Thr Arg Val Glu Ser
995 1000 1005

gaa aac aaa gtg gtg att ctg gac tcc ttc gat ccg ctt gtg gcg gag 5055
Glu Asn Lys Val Val Ile Leu Asp Ser Phe Asp Pro Leu Val Ala Glu
1010 1015 1020

gag gac gag cgg gag atc tcc gta ccc gca gaa atc ctg cgg aag tct 5103
Glu Asp Glu Arg Glu Ile Ser Val Pro Ala Glu Ile Leu Arg Lys Ser
1025 1030 1035

cgg aga ttc gcc cag gcc ctg ccc gtt tgg gcg cgg ccg gac tat aac 5151
Arg Arg Phe Ala Gln Ala Leu Pro Val Trp Ala Arg Pro Asp Tyr Asn
1040 1045 1050

ccc ccg cta gtg gag acg tgg aaa aag ccc gac tac gaa cca cct gtg 5199
Pro Pro Leu Val Glu Thr Trp Lys Lys Pro Asp Tyr Glu Pro Pro Val
1055 1060 1065 1070

gtc cat ggc tgc ccg ctt cca cct cca aag tcc cct cct gtg cct ccg 5247
Val His Gly Cys Pro Leu Pro Pro Pro Lys Ser Pro Pro Val Pro Pro
1075 1080 1085

cct cgg aag aag cgg acg gtg gtc ctc act gaa tca acc cta tct act 5295
Pro Arg Lys Lys Arg Thr Val Val Leu Thr Glu Ser Thr Leu Ser Thr
1090 1095 1100

gcc ttg gcc gag ctc gcc acc aga agc ttt ggc agc tcc tca act tcc 5343
Ala Leu Ala Glu Leu Ala Thr Arg Ser Phe Gly Ser Ser Ser Thr Ser
1105 1110 1115

ggc att acg ggc gac aat acg aca aca tcc tct gag ccc gcc cct tct 5391
Gly Ile Thr Gly Asp Asn Thr Thr Thr Ser Ser Glu Pro Ala Pro Ser
1120 1125 1130

ggc tgc ccc ccc gac tcc gac gct gag tcc tat tcc tcc atg ccc ccc 5439
Gly Cys Pro Pro Asp Ser Asp Ala Glu Ser Tyr Ser Ser Met Pro Pro
1135 1140 1145 1150

ctg gag ggg gag cct ggg gat ccg gat ctt agc gac ggg tca tgg tca 5487
Leu Glu Gly Glu Pro Gly Asp Pro Asp Leu Ser Asp Gly Ser Trp Ser
1155 1160 1165

acg gtc agt agt gag gcc aac gcg gag gat gtc gtg tgc tgc tca atg 5535
Thr Val Ser Ser Glu Ala Asn Ala Glu Asp Val Val Cys Cys Ser Met
1170 1175 1180

tct tac tct tgg aca ggc gca ctc gtc acc ccg tgc gcc gcg gaa gaa 5583
Ser Tyr Ser Trp Thr Gly Ala Leu Val Thr Pro Cys Ala Ala Glu Glu
1185 1190 1195

cag aaa ctg ccc atc aat gca cta agc aac tcg ttg cta cgt cac cac 5631
Gln Lys Leu Pro Ile Asn Ala Leu Ser Asn Ser Leu Leu Arg His His
1200 1205 1210

aat ttg gtg tat tcc acc acc tca cgc agt gct tgc caa agg cag aag 5679
Asn Leu Val Tyr Ser Thr Thr Ser Arg Ser Ala Cys Gln Arg Gln Lys
1215 1220 1225 1230

aaa gtc aca ttt gac aga ctg caa gtt ctg gac agc cat tac cag gac 5727
Lys Val Thr Phe Asp Arg Leu Gln Val Leu Asp Ser His Tyr Gln Asp
1235 1240 1245

gta ctc aag gag gtt aaa gca gcg gcg tca aaa gtg aag gct aac ttg 5775
Val Leu Lys Glu Val Lys Ala Ala Ala Ser Lys Val Lys Ala Asn Leu
1250 1255 1260

cta tcc gta gag gaa gct tgc agc ctg acg ccc cca cac tca gcc aaa 5823
Leu Ser Val Glu Glu Ala Cys Ser Leu Thr Pro Pro His Ser Ala Lys
1265 1270 1275

tcc aag ttt ggt tat ggg gca aaa gac gtc cgt tgc cat gcc aga aag 5871
Ser Lys Phe Gly Tyr Gly Ala Lys Asp Val Arg Cys His Ala Arg Lys
1280 1285 1290

gcc gta acc cac atc aac tcc gtg tgg aaa gac ctt ctg gaa gac aat 5919
Ala Val Thr His Ile Asn Ser Val Trp Lys Asp Leu Leu Glu Asp Asn
1295 1300 1305 1310

gta aca cca ata gac act acc atc atg gct aag aac gag gtt ttc tgc 5967
Val Thr Pro Ile Asp Thr Thr Ile Met Ala Lys Asn Glu Val Phe Cys
1315 1320 1325

gtt cag cct gag aag ggg ggt cgt aag cca gct cgt ctc atc gtg ttc 6015
Val Gln Pro Glu Lys Gly Gly Arg Lys Pro Ala Arg Leu Ile Val Phe
1330 1335 1340

ccc gat ctg ggc gtg cgc gtg tgc gaa aag atg gct ttg tac gac gtg 6063
Pro Asp Leu Gly Val Arg Val Cys Glu Lys Met Ala Leu Tyr Asp Val
1345 1350 1355

gtt aca aag ctc ccc ttg gcc gtg atg gga agc tcc tac gga ttc caa 6111
Val Thr Lys Leu Pro Leu Ala Val Met Gly Ser Ser Tyr Gly Phe Gln
1360 1365 1370

tac tca cca gga cag cgg gtt gaa ttc ctc gtg caa gcg tgg aag tcc 6159
Tyr Ser Pro Gly Gln Arg Val Glu Phe Leu Val Gln Ala Trp Lys Ser
1375 1380 1385 1390

aag aaa acc cca atg ggg ttc tcg tat gat acc cgc tgc ttt gac tcc 6207
Lys Lys Thr Pro Met Gly Phe Ser Tyr Asp Thr Arg Cys Phe Asp Ser
1395 1400 1405

aca gtc act gag agc gac atc cgt acg gag gag gca atc tac caa tgt 6255
Thr Val Thr Glu Ser Asp Ile Arg Thr Glu Glu Ala Ile Tyr Gln Cys
1410 1415 1420

tgt gac ctc gac ccc caa gcc cgc gtg gcc atc aag tcc ctc acc gag 6303
Cys Asp Leu Asp Pro Gln Ala Arg Val Ala Ile Lys Ser Leu Thr Glu
1425 1430 1435

agg ctt tat gtt ggg ggc cct ctt acc aat tca agg ggg gag aac tgc 6351
Arg Leu Tyr Val Gly Gly Pro Leu Thr Asn Ser Arg Gly Glu Asn Cys
1440 1445 1450

ggc tat cgc agg tgc cgc gcg agc ggc gta ctg aca act agc tgt ggt 6399
Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Leu Thr Thr Ser Cys Gly
1455 1460 1465 1470

aac acc ctc act tgc tac atc aag gcc cgg gca gcc tgt cga gcc gca 6447
Asn Thr Leu Thr Cys Tyr Ile Lys Ala Arg Ala Ala Cys Arg Ala Ala
1475 1480 1485

ggg ctc cag gac tgc acc atg ctc gtg tgt ggc gac gac tta gtc gtt 6495
Gly Leu Gln Asp Cys Thr Met Leu Val Cys Gly Asp Asp Leu Val Val
1490 1495 1500

atc tgt gaa agc gcg ggg gtc cag gag gac gcg gcg agc ctg aga gcc 6543
Ile Cys Glu Ser Ala Gly Val Gln Glu Asp Ala Ala Ser Leu Arg Ala
1505 1510 1515

ttc acg gag gct atg acc agg tac tcc gcc ccc cct ggg gac ccc cca 6591
Phe Thr Glu Ala Met Thr Arg Tyr Ser Ala Pro Pro Gly Asp Pro Pro
1520 1525 1530

caa cca gaa tac gac ttg gag ctc ata aca tca tgc tcc tcc aac gtg 6639
Gln Pro Glu Tyr Asp Leu Glu Leu Ile Thr Ser Cys Ser Ser Asn Val
1535 1540 1545 1550

tca gtc gcc cac gac ggc gct gga aag agg gtc tac tac ctc acc cgt 6687
Ser Val Ala His Asp Gly Ala Gly Lys Arg Val Tyr Tyr Leu Thr Arg
1555 1560 1565

gac cct aca acc ccc ctc gcg aga gct gcg tgg gag aca gca aga cac 6735
Asp Pro Thr Thr Pro Leu Ala Arg Ala Ala Trp Glu Thr Ala Arg His
1570 1575 1580

act cca gtc aat tcc tgg cta ggc aac ata atc atg ttt gcc ccc aca 6783
Thr Pro Val Asn Ser Trp Leu Gly Asn Ile Ile Met Phe Ala Pro Thr
1585 1590 1595

ctg tgg gcg agg atg ata ctg atg acc cat ttc ttt agc gtc ctt ata 6831
Leu Trp Ala Arg Met Ile Leu Met Thr His Phe Phe Ser Val Leu Ile
1600 1605 1610

gcc agg gac cag ctt gaa cag gcc ctc gat tgc gag atc tac ggg gcc 6879
Ala Arg Asp Gln Leu Glu Gln Ala Leu Asp Cys Glu Ile Tyr Gly Ala
1615 1620 1625 1630

tgc tac tcc ata gaa cca ctg gat cta cct cca atc att caa aga ctc 6927
Cys Tyr Ser Ile Glu Pro Leu Asp Leu Pro Pro Ile Ile Gln Arg Leu
1635 1640 1645

cat ggc ctc agc gca ttt tca ctc cac agt tac tct cca ggt gaa atc 6975
His Gly Leu Ser Ala Phe Ser Leu His Ser Tyr Ser Pro Gly Glu Ile
1650 1655 1660

aat agg gtg gcc gca tgc ctc aga aaa ctt ggg gta ccg ccc ttg cga 7023
Asn Arg Val Ala Ala Cys Leu Arg Lys Leu Gly Val Pro Pro Leu Arg
1665 1670 1675

gct tgg aga cac cgg gcc cgg agc gtc cgc gct agg ctt ctg gcc aga 7071
Ala Trp Arg His Arg Ala Arg Ser Val Arg Ala Arg Leu Leu Ala Arg
1680 1685 1690

ggc ggc agg gct gcc ata tgt ggc aag tac ctc ttc aac tgg gca gta 7119
Gly Gly Arg Ala Ala Ile Cys Gly Lys Tyr Leu Phe Asn Trp Ala Val
1695 1700 1705 1710

aga aca aag ctc aaa ctc act cca ata gcg gcc gct ggc cag ctg gac 7167
Arg Thr Lys Leu Lys Leu Thr Pro Ile Ala Ala Ala Gly Gln Leu Asp
1715 1720 1725

ttg tcc ggc tgg ttc acg gct ggc tac agc ggg gga gac att tat cac 7215
Leu Ser Gly Trp Phe Thr Ala Gly Tyr Ser Gly Gly Asp Ile Tyr His
1730 1735 1740

agc gtg tct cat gcc cgg ccc cgc tgg atc tgg ttt tgc cta ctc ctg 7263
Ser Val Ser His Ala Arg Pro Arg Trp Ile Trp Phe Cys Leu Leu Leu
1745 1750 1755

ctt gct gca ggg gta ggc atc tac ctc ctc ccc aac cga tgaaggttgg 7312
Leu Ala Ala Gly Val Gly Ile Tyr Leu Leu Pro Asn Arg
1760 1765 1770






ggtaaacact ccggcctaaa aaaaaaaaaa aatctagaaa ggcgcgccaa gatatcaagg 7372

atccactacg cgttagagct cgctgatcag cctcgactgt gccttctagt tgccagccat 7432

ctgttgtttg cccctccccc gtgccttcct tgaccctgga aggtgccact cccactgtcc 7492

tttcctaata aaatgaggaa attgcatcgc attgtctgag taggtgtcat tctattctgg 7552

ggggtggggt ggggcaggac agcaaggggg aggattggga agacaatagc aggcatgctg 7612

gggagctctt ccgcttcctc gctcactgac tcgctgcgct cggtcgttcg gctgcggcga 7672

cggcuaucca cagaatcagg ggataacgca 7732

cgcgagcaaa aggccagcaa aaggccagga accgtaaaaa ggccgcgttg 7792

cccacaggcc ccgcccccct gacgagcatc acaaaaatcg acgctcaagt 7852

cagaggtggc gaaacccgac aggactataa agataccagg cgtttccccc tggaagctcc 7912

cCcgtgcgct cccccgcccc gaccctgccg cttaccggat acctgtccgc ctttctccct 7972

tcgggaagcg cggcgccccc tcaatgctca cgctgtaggt atctcagttc ggtgtaggtc 8032

gttcgctcca agctgggctg tgtgcacgaa ccccccgttc agcccgaccg ctgcgcctta 8092

tccggtaact accgucccga gtccaacccg gtaagacacg acttatcgcc actggcagca 8152

gccactggta acaggattag cagagcgagg tatgtaggcg gtgctacaga gttcttgaag 8212

tggtggccta actacggctia cactagaagg acagtatttg gtatctgcgc tctgctgaag 8272

ccagtztacct tcggaaaaag agttggtagc tcttgatccg gcaaacaaac caccgctggt 8332

agcggtggt^ cccccgcccg caagcagcag attacgcgca gaaaaaaagg atctcaagaa 8392

gaCcctttga cccrctccac ggggtctgac gctcagtgga acgaaaactc acgttaaggg 8452

actccggcca tgagattatc aaaaaggatc ttcacctaga tccttttaaa ttaaaaatga 8512

agttttaaat caatctaaag tatatatgag taaacttggt ctgacagtta ccaatgctta 8572

atcagtgagg cacctatctc agcgatctgt ctatttcgtt catccatagt tgcctgactc 8632

cccgtcgtgt agataactac gatacgggag ggcttaccat ctggccccag tgctgcaatg 8692

ataccgcgag acccacgctc accggctcca gatttatcag caataaacca gccagccgga 8752

^999ccgagc gcagaagtgg tcctgcaact ttatccgcct ccatccagtc tattaattgt 8812

tgccgggaag ctagagtaag tagttcgcca gttaatagtt tgcgcaacgt tgttgccatt 8872

gctacaggca tcgtggtgtc acgctcgtcg tttggtatgg cttcattcag ctccggttcc 8932

caacgatcaa ggcgagttac atgatccccc atgttgtgca aaaaagcggt tagctccttc 8992

gg^cctccga tcgttgtcag aagtaagttg gccgcagtgt tatcactcat ggttatggca 9052

gcactigcata attctcttac tgtcatgcca tccgtaagat gcttttctgt gactggtgag 9112

agccac ucug agaat^agCgt^ atgcggcgac cgagttgctc ttgcccggcg 9172

^caa^acggg atiaat.accgc gccacatagc agaactttaa aagtgctcat cattggaaaa 9232

cgttcttcgg ggcgaaaact ctcaaggatc ttaccgctgt tgagatccag ttcgatgtaa 9292

cccactcgtg cacccaactg atcttcagca tcttttactt tcaccagcgt ttctgggtga 9352

gcaaaaacag gaaggcaaaa tgccgcaaaa aagggaataa gggcgacacg gaaatgttga 9412

atactcatac tcttcctttt tcaatattat tgaagcattt atcagggtta ttgtctcatg 9472

agcggataca tatttgaatg tatttagaaa aataaacaaa taggggttcc gcgcacattt 9532

ccccgaaaag tgccacctga cgtctaagaa accattatta tcatgacatt aacctataaa 9592

aataggcgta tcacgaggcc ctttcgtc 9620



<210> 4 
<211> 1771 
<212> PRT 

<213> Artificial Sequence 
<400> 4 

Met Ala Ala Tyr Ala Ala Gln Gly Tyr Lys Val Leu Val Leu Asn Pro
1 5 10 15

Ser Val Ala Ala Thr Leu Gly Phe Gly Ala Tyr Met Ser Lys Ala His
20 25 30

Gly Ile Asp Pro Asn Ile Arg Thr Gly Val Arg Thr Ile Thr Thr Gly
35 40 45

Ser Pro Ile Thr Tyr Ser Thr Tyr Gly Lys Phe Leu Ala Asp Gly Gly
50 55 60

Cys Ser Gly Gly Ala Tyr Asp Ile Ile Ile Cys Asp Glu Cys His Ser
65 70 75 80

Thr Asp Ala Thr Ser Ile Leu Gly Ile Gly Thr Val Leu Asp Gln Ala
85 90 95

Glu Thr Ala Gly Ala Arg Leu Val Val Leu Ala Thr Ala Thr Pro Pro
100 105 110

Gly Ser Val Thr Val Pro His Pro Asn Ile Glu Glu Val Ala Leu Ser
115 120 125

Thr Thr Gly Glu Ile Pro Phe Tyr Gly Lys Ala Ile Pro Leu Glu Val
130 135 140

Ile Lys Gly Gly Arg His Leu Ile Phe Cys His Ser Lys Lys Lys Cys
145 150 155 160

Asp Glu Leu Ala Ala Lys Leu Val Ala Leu Gly Ile Asn Ala Val Ala
165 170 175

Tyr Tyr Arg Gly Leu Asp Val Ser Val Ile Pro Thr Ser Gly Asp Val
180 185 190

Val Val Val Ala Thr Asp Ala Leu Met Thr Gly Tyr Thr Gly Asp Phe
195 200 205

Asp Ser Val Ile Asp Cys Asn Thr Cys Val Thr Gln Thr Val Asp Phe
210 215 220

Ser Leu Asp Pro Thr Phe Thr Ile Glu Thr Ile Thr Leu Pro Gln Asp
225 230 235 240

Ala Val Ser Arg Thr Gln Arg Arg Gly Arg Thr Gly Arg Gly Lys Pro
245 250 255

Gly Ile Tyr Arg Phe Val Ala Pro Gly Glu Arg Pro Ser Gly Met Phe
260 265 270

Asp Ser Ser Val Leu Cys Glu Cys Tyr Asp Ala Gly Cys Ala Trp Tyr
275 280 285

Glu Leu Thr Pro Ala Glu Thr Thr Val Arg Leu Arg Ala Tyr Met Asn
290 295 300

Thr Pro Gly Leu Pro Val Cys Gln Asp His Leu Glu Phe Trp Glu Gly
305 310 315 320

Val Phe Thr Gly Leu Thr His Ile Asp Ala His Phe Leu Ser Gln Thr
325 330 335

Lys Gln Ser Gly Glu Asn Leu Pro Tyr Leu Val Ala Tyr Gln Ala Thr
340 345 350

Val Cys Ala Arg Ala Gln Ala Pro Pro Pro Ser Trp Asp Gln Met Trp
355 360 365

Lys Cys Leu Ile Arg Leu Lys Pro Thr Leu His Gly Pro Thr Pro Leu
370 375 380

Leu Tyr Arg Leu Gly Ala Val Gln Asn Glu Ile Thr Leu Thr His Pro
385 390 395 400

Val Thr Lys Tyr Ile Met Thr Cys Met Ser Ala Asp Leu Glu Val Val
405 410 415

Thr Ser Thr Trp Val Leu Val Gly Gly Val Leu Ala Ala Leu Ala Ala
420 425 430

Tyr Cys Leu Ser Thr Gly Cys Val Val Ile Val Gly Arg Val Val Leu
435 440 445

Ser Gly Lys Pro Ala Ile Ile Pro Asp Arg Glu Val Leu Tyr Arg Glu
450 455 460

Phe Asp Glu Met Glu Glu Cys Ser Gln His Leu Pro Tyr Ile Glu Gln
465 470 475 480

Gly Met Met Leu Ala Glu Gln Phe Lys Gln Lys Ala Leu Gly Leu Leu
485 490 495

Gln Thr Ala Ser Arg Gln Ala Glu Val Ile Ala Pro Ala Val Gln Thr
500 505 510

Asn Trp Gln Lys Leu Glu Thr Phe Trp Ala Lys His Met Trp Asn Phe
515 520 525

Ile Ser Gly Ile Gln Tyr Leu Ala Gly Leu Ser Thr Leu Pro Gly Asn
530 535 540

Pro Ala Ile Ala Ser Leu Met Ala Phe Thr Ala Ala Val Thr Ser Pro
545 550 555 560

Leu Thr Thr Ser Gln Thr Leu Leu Phe Asn Ile Leu Gly Gly Trp Val
565 570 575

Ala Ala Gln Leu Ala Ala Pro Gly Ala Ala Thr Ala Phe Val Gly Ala
580 585 590

Gly Leu Ala Gly Ala Ala Ile Gly Ser Val Gly Leu Gly Lys Val Leu
595 600 605

Ile Asp Ile Leu Ala Gly Tyr Gly Ala Gly Val Ala Gly Ala Leu Val
610 615 620

Ala Phe Lys Ile Met Ser Gly Glu Val Pro Ser Thr Glu Asp Leu Val
625 630 635 640

Asn Leu Leu Pro Ala Ile Leu Ser Pro Gly Ala Leu Val Val Gly Val
645 650 655

Val Cys Ala Ala Ile Leu Arg Arg His Val Gly Pro Gly Glu Gly Ala
660 665 670

Val Gln Trp Met Asn Arg Leu Ile Ala Phe Ala Ser Arg Gly Asn His
675 680 685

Val Ser Pro Thr His Tyr Val Pro Glu Ser Asp Ala Ala Ala Arg Val
690 695 700

Thr Ala Ile Leu Ser Ser Leu Thr Val Thr Gln Leu Leu Arg Arg Leu
705 710 715 720

His Gln Trp Ile Ser Ser Glu Cys Thr Thr Pro Cys Ser Gly Ser Trp
725 730 735

Leu Arg Asp Ile Trp Asp Trp Ile Cys Glu Val Leu Ser Asp Phe Lys
740 745 750

Thr Trp Leu Lys Ala Lys Leu Met Pro Gln Leu Pro Gly Ile Pro Phe
755 760 765

Val Ser Cys Gln Arg Gly Tyr Lys Gly Val Trp Arg Gly Asp Gly Ile
770 775 780

Met His Thr Arg Cys His Cys Gly Ala Glu Ile Thr Gly His Val Lys
785 790 795 800

Asn Gly Thr Met Arg Ile Val Gly Pro Arg Thr Cys Arg Asn Met Trp
805 810 815

Ser Gly Thr Phe Pro Ile Asn Ala Tyr Thr Thr Gly Pro Cys Thr Pro
820 825 830

Leu Pro Ala Pro Asn Tyr Thr Phe Ala Leu Trp Arg Val Ser Ala Glu
835 840 845

Glu Tyr Val Glu Ile Arg Gln Val Gly Asp Phe His Tyr Val Thr Gly
850 855 860

Met Thr Thr Asp Asn Leu Lys Cys Pro Cys Gln Val Pro Ser Pro Glu
865 870 875 880

Phe Phe Thr Glu Leu Asp Gly Val Arg Leu His Arg Phe Ala Pro Pro
885 890 895

Cys Lys Pro Leu Leu Arg Glu Glu Val Ser Phe Arg Val Gly Leu His
900 905 910

Glu Tyr Pro Val Gly Ser Gln Leu Pro Cys Glu Pro Glu Pro Asp Val
915 920 925

Ala Val Leu Thr Ser Met Leu Thr Asp Pro Ser His Ile Thr Ala Glu
930 935 940

Ala Ala Gly Arg Arg Leu Ala Arg Gly Ser Pro Pro Ser Val Ala Ser
945 950 955 960

Ser Ser Ala Ser Gln Leu Ser Ala Pro Ser Leu Lys Ala Thr Cys Thr
965 970 975

Ala Asn His Asp Ser Pro Asp Ala Glu Leu Ile Glu Ala Asn Leu Leu
980 985 990

Trp Arg Gln Glu Met Gly Gly Asn Ile Thr Arg Val Glu Ser Glu Asn
995 1000 1005

Lys Val Val Ile Leu Asp Ser Phe Asp Pro Leu Val Ala Glu Glu Asp
1010 1015 1020

Glu Arg Glu Ile Ser Val Pro Ala Glu Ile Leu Arg Lys Ser Arg Arg
1025 1030 1035 1040

Phe Ala Gln Ala Leu Pro Val Trp Ala Arg Pro Asp Tyr Asn Pro Pro
1045 1050 1055

Leu Val Glu Thr Trp Lys Lys Pro Asp Tyr Glu Pro Pro Val Val His
1060 1065 1070

Gly Cys Pro Leu Pro Pro Pro Lys Ser Pro Pro Val Pro Pro Pro Arg
1075 1080 1085

Lys Lys Arg Thr Val Val Leu Thr Glu Ser Thr Leu Ser Thr Ala Leu
1090 1095 1100

Ala Glu Leu Ala Thr Arg Ser Phe Gly Ser Ser Ser Thr Ser Gly Ile
1105 1110 1115 1120

Thr Gly Asp Asn Thr Thr Thr Ser Ser Glu Pro Ala Pro Ser Gly Cys
1125 1130 1135

Pro Pro Asp Ser Asp Ala Glu Ser Tyr Ser Ser Met Pro Pro Leu Glu
1140 1145 1150

Gly Glu Pro Gly Asp Pro Asp Leu Ser Asp Gly Ser Trp Ser Thr Val
1155 1160 1165

Ser Ser Glu Ala Asn Ala Glu Asp Val Val Cys Cys Ser Met Ser Tyr
1170 1175 1180

Ser Trp Thr Gly Ala Leu Val Thr Pro Cys Ala Ala Glu Glu Gln Lys
1185 1190 1195 1200

Leu Pro Ile Asn Ala Leu Ser Asn Ser Leu Leu Arg His His Asn Leu
1205 1210 1215

Val Tyr Ser Thr Thr Ser Arg Ser Ala Cys Gln Arg Gln Lys Lys Val
1220 1225 1230

Thr Phe Asp Arg Leu Gln Val Leu Asp Ser His Tyr Gln Asp Val Leu
1235 1240 1245

Lys Glu Val Lys Ala Ala Ala Ser Lys Val Lys Ala Asn Leu Leu Ser
1250 1255 1260

Val Glu Glu Ala Cys Ser Leu Thr Pro Pro His Ser Ala Lys Ser Lys
1265 1270 1275 1280

Phe Gly Tyr Gly Ala Lys Asp Val Arg Cys His Ala Arg Lys Ala Val
1285 1290 1295

Thr His Ile Asn Ser Val Trp Lys Asp Leu Leu Glu Asp Asn Val Thr
1300 1305 1310

Pro Ile Asp Thr Thr Ile Met Ala Lys Asn Glu Val Phe Cys Val Gln
1315 1320 1325

Pro Glu Lys Gly Gly Arg Lys Pro Ala Arg Leu Ile Val Phe Pro Asp
1330 1335 1340

Leu Gly Val Arg Val Cys Glu Lys Met Ala Leu Tyr Asp Val Val Thr
1345 1350 1355 1360

Lys Leu Pro Leu Ala Val Met Gly Ser Ser Tyr Gly Phe Gln Tyr Ser
1365 1370 1375

Pro Gly Gln Arg Val Glu Phe Leu Val Gln Ala Trp Lys Ser Lys Lys
1380 1385 1390

Thr Pro Met Gly Phe Ser Tyr Asp Thr Arg Cys Phe Asp Ser Thr Val
1395 1400 1405

Thr Glu Ser Asp Ile Arg Thr Glu Glu Ala Ile Tyr Gln Cys Cys Asp
1410 1415 1420

Leu Asp Pro Gln Ala Arg Val Ala Ile Lys Ser Leu Thr Glu Arg Leu
1425 1430 1435 1440

Tyr Val Gly Gly Pro Leu Thr Asn Ser Arg Gly Glu Asn Cys Gly Tyr
1445 1450 1455

Arg Arg Cys Arg Ala Ser Gly Val Leu Thr Thr Ser Cys Gly Asn Thr
1460 1465 1470

Leu Thr Cys Tyr Ile Lys Ala Arg Ala Ala Cys Arg Ala Ala Gly Leu
1475 1480 1485

Gln Asp Cys Thr Met Leu Val Cys Gly Asp Asp Leu Val Val Ile Cys
1490 1495 1500

Glu Ser Ala Gly Val Gln Glu Asp Ala Ala Ser Leu Arg Ala Phe Thr
1505 1510 1515 1520

Glu Ala Met Thr Arg Tyr Ser Ala Pro Pro Gly Asp Pro Pro Gln Pro
1525 1530 1535

Glu Tyr Asp Leu Glu Leu Ile Thr Ser Cys Ser Ser Asn Val Ser Val
1540 1545 1550

Ala His Asp Gly Ala Gly Lys Arg Val Tyr Tyr Leu Thr Arg Asp Pro
1555 1560 1565

Thr Thr Pro Leu Ala Arg Ala Ala Trp Glu Thr Ala Arg His Thr Pro
1570 1575 1580

Val Asn Ser Trp Leu Gly Asn Ile Ile Met Phe Ala Pro Thr Leu Trp
1585 1590 1595 1600

Ala Arg Met Ile Leu Met Thr His Phe Phe Ser Val Leu Ile Ala Arg
1605 1610 1615

Asp Gln Leu Glu Gln Ala Leu Asp Cys Glu Ile Tyr Gly Ala Cys Tyr
1620 1625 1630

Ser Ile Glu Pro Leu Asp Leu Pro Pro Ile Ile Gln Arg Leu His Gly
1635 1640 1645

Leu Ser Ala Phe Ser Leu His Ser Tyr Ser Pro Gly Glu Ile Asn Arg
1650 1655 1660

Val Ala Ala Cys Leu Arg Lys Leu Gly Val Pro Pro Leu Arg Ala Trp
1665 1670 1675 1680

Arg His Arg Ala Arg Ser Val Arg Ala Arg Leu Leu Ala Arg Gly Gly
1685 1690 1695

Arg Ala Ala Ile Cys Gly Lys Tyr Leu Phe Asn Trp Ala Val Arg Thr
1700 1705 1710

Lys Leu Lys Leu Thr Pro Ile Ala Ala Ala Gly Gln Leu Asp Leu Ser
1715 1720 1725

Gly Trp Phe Thr Ala Gly Tyr Ser Gly Gly Asp Ile Tyr His Ser Val
1730 1735 1740

Ser His Ala Arg Pro Arg Trp Ile Trp Phe Cys Leu Leu Leu Leu Ala
1745 1750 1755 1760

Ala Gly Val Gly Ile Tyr Leu Leu Pro Asn Arg
1765 1770



<210> 5 
<211> 4282 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: pCMVII 

<400> 5 

tcgcgcgttt cggtgatgac ggtgaaaacc tctgacacat gcagctcccg gagacggtca 60

cagcttgtct gtaagcggat gccgggagca gacaagcccg tcagggcgcg tcagcgggtg 120

ttggcgggtg tcggggctgg cttaactatg cggcatcaga gcagattgta ctgagagtgc 180

accatatgaa gctttttgca aaagcctagg cctccaaaaa agcctcctca ctacttctgg 240

aatagctcag aggccgaggc ggcctcggcc tctgcataaa taaaaaaaat tagtcagcca 300

tggggcggag aatgggcgga actgggcggg gagggaatta ttggctattg gccattgcat 360

acgttgtatc tatatcataa tatgtacatt tatattggct catgtccaat atgaccgcca 420

tgttgacatt gattattgac tagttattaa tagtaatcaa ttacggggtc attagttcat 480

agcccatata tggagttccg cgttacataa cttacggtaa atggcccgcc tggctgaccg 540

cccaacgacc cccgcccatt gacgtcaata atgacgtatg ttcccatagt aacgccaata 600

gggactttcc attgacgtca atgggtggag tatttacggt aaactgccca cttggcagta 660

catcaagtgt atcatatgcc aagtccgccc cctattgacg tcaatgacgg taaatggccc 720

gcctggcatt atgcccagta catgacctta cgggactttc ctacttggca gtacatctac 780

gtattagtca tcgctattac catggtgatg cggttttggc agtacaccaa tgggcgtgga 840

tagcggtttg actcacgggg atttccaagt ctccacccca ttgacgtcaa tgggagtttg 900

ttttggcacc aaaatcaacg ggactttcca aaatgtcgta ataaccccgc cccgttgacg 960

caaatgggcg gtaggcgtgt acggtgggag gtctatataa gcagagctcg tttagtgaac 1020

cgtcagatcg cctggagacg ccatccacgc tgttttgacc tccatagaag acaccgggac 1080

cgatccagcc tccgcggccg ggaacggtgc attggaacgc ggattccccg tgccaagagt 1140

gacgtaagta ccgcctatag actctatagg cacacccctt tggctcttat gcatgctata 1200

ctgtttttgg cttggggcct atacaccccc gcttccttat gctataggtg atggtatagc 1260

ttagcctata ggtgtgggtt attgaccatt attgaccact cccctattgg tgacgatact 1320

ttccattact aatccataac atggctcttt gccacaacta tctctattgg ctatatgcca 1380

atactctgtc cttcagagac tgacacggac tctgtatttt tacaggatgg ggtcccattt 1440

attatttaca aattcacata tacaacaacg ccgtcccccg tgcccgcagt ttttattaaa 1500

catagcgtgg gatctccacg cgaatctcgg gtacgtgttc cggacatggg ctcttctccg 1560

gtagcggcgg agcttccaca tccgagccct ggtcccatgc ctccagcggc tcatggtcgc 1620

tcggcagctc cttgctccta acagtggagg ccagacttag gcacagcaca atgcccacca 1680

ccaccagtgt gccgcacaag gccgtggcgg tagggtatgt gtctgaaaat gagctcggag 1740


attgggctcg 


caccgctgac 


gcagatggaa 


gacttaaggc 


agcggcagaa 


gaagatgcag 


1800 


gcagctgagt tgttgtattc tgataagagt cagaggtaac tcccgttgcg gtgctgttaa 1860
cggtggaggg cagtgtagtc tgagcagtac tcgttgctgc cgcgcgcgcc accagacata 1920
atagctgaca gactaacaga ctgttccttt ccatgggtct tttctgcagt caccgtcgtc 1980
gacctaagaa ttcagactcg agcaagtcta gaaaggcgcg ccaagatatc aaggatccac 2040
tacgcgttag agctcgctga tcagcctcga ctgtgccttc tagttgccag ccatctgttg 2100
tttgcccctc ccccgtgcct tccttgaccc tggaaggtgc cactcccact gtcctttcct 2160
aataaaatga ggaaattgca tcgcattgtc tgagtaggtg tcattctatt ctggggggtg 2220
gggtggggca ggacagcaag ggggaggatt gggaagacaa tagcaggcat gctggggagc 2280
tcttccgctt cctcgctcac tgactcgctg cgctcggtcg ttcggctgcg gcgagcggta 2340
tcagctcact caaaggcggt aatacggtta tccacagaat caggggataa cgcaggaaag 2400
aacatgtgag caaaaggcca gcaaaaggcc aggaaccgta aaaaggccgc gttgctggcg 2460
tttttccata ggctccgccc ccctgacgag catcacaaaa atcgacgctc aagtcagagg 2520
tggcgaaacc cgacaggact ataaagatac caggcgtttc cccctggaag ctccctcgtg 2580
cgctctcctg ttccgaccct gccgcttacc ggatacctgt ccgcctttct cccttcggga 2640
agcgtggcgc tttctcaatg ctcacgctgt aggtatctca gttcggtgta ggtcgttcgc 2700
tccaagctgg gctgtgtgca cgaacccccc gttcagcccg accgctgcgc cttatccggt 2760
aactatcgtc ttgagtccaa cccggtaaga cacgacttat cgccactggc agcagccact 2820
ggtaacagga ttagcagagc gaggtatgta ggcggtgcta cagagttctt gaagtggtgg 2880
cctaactacg gctacactag aaggacagta tttggtatct gcgctctgct gaagccagtt 2940
accttcggaa aaagagttgg tagctcttga tccggcaaac aaaccaccgc tggtagcggt 3000
ggtttttttg tttgcaagca gcagattacg cgcagaaaaa aaggatctca agaagatcct 3060
ttgatctttt ctacggggtc tgacgctcag tggaacgaaa actcacgtta agggattttg 3120
gtcatgagat tatcaaaaag gatcttcacc tagatccttt taaattaaaa atgaagtttt 3180
aaatcaatct aaagtatata tgagtaaact tggtctgaca gttaccaatg cttaatcagt 3240
gaggcaccta tctcagcgat ctgtctattt cgttcatcca tagttgcctg actccccgtc 3300
gtgtagataa ctacgatacg ggagggctta ccatctggcc ccagtgctgc aatgataccg 3360
cgagacccac gctcaccggc tccagattta tcagcaataa accagccagc cggaagggcc 3420
gagcgcagaa gtggtcctgc aactttatcc gcctccatcc agtctattaa ttgttgccgg 3480
gaagctagag taagtagttc gccagttaat agtttgcgca acgttgttgc cattgctaca 3540
ggcatcgtgg tgtcacgctc gtcgtttggt atggcttcat tcagctccgg ttcccaacga 3600


tcaaggcgag ttacatgatc ccccatgttg tgcaaaaaag cggttagctc cttcggtcct 3660
ccgatcgttg tcagaagtaa gttggccgca gtgttatcac tcatggttat ggcagcactg 3720
cataattctc ttactgtcat gccatccgta agatgctttt ctgtgactgg tgagtactca 3780
accaagtcat tctgagaata gtgtatgcgg cgaccgagtt gctcttgccc ggcgtcaata 3840
cgggataata ccgcgccaca tagcagaact ttaaaagtgc tcatcattgg aaaacgttct 3900
tcggggcgaa aactctcaag gatcttaccg ctgttgagat ccagttcgat gtaacccact 3960
cgtgcaccca actgatcttc agcatctttt actttcacca gcgtttctgg gtgagcaaaa 4020
acaggaaggc aaaatgccgc aaaaaaggga ataagggcga cacggaaatg ttgaatactc 4080
atactcttcc tttttcaata ttattgaagc atttatcagg gttattgtct catgagcgga 4140
tacatatttg aatgtattta gaaaaataaa caaatagggg ttccgcgcac atttccccga 4200
aaagtgccac ctgacgtcta agaaaccatt attatcatga cattaaccta taaaaatagg 4260
cgtatcacga ggccctttcg tc 4282



<210> 6 

<211> 6299 

<212> DNA 

<213> Artificial Sequence 

<220> 

<223> Description of Artificial Sequence: pNS34a 

<220> 
<221> CDS 

<222> (1990)..(4047)
<400> 6 



cgcgcgtttc ggtgatgacg gtgaaaacct ctgacacatg cagctcccgg agacggtcac 60
agcttgtctg taagcggatg ccgggagcag acaagcccgt cagggcgcgt cagcgggtgt 120
tggcgggtgt cggggctggc ttaactatgc ggcatcagag cagattgtac tgagagtgca 180
ccatatgaag ctttttgcaa aagcctaggc ctccaaaaaa gcctcctcac tacttctgga 240
atagctcaga ggccgaggcg gcctcggcct ctgcataaat aaaaaaaatt agtcagccat 300
ggggcggaga atgggcggaa ctgggcgggg agggaattat tggctattgg ccattgcata 360
cgttgtatct atatcataat atgtacattt atattggctc atgtccaata tgaccgccat 420
gttgacattg attattgact agttattaat agtaatcaat tacggggtca ttagttcata 480
gcccatatat ggagttccgc gttacataac ttacggtaaa tggcccgcct ggctgaccgc 540
ccaacgaccc ccgcccattg acgtcaataa tgacgtatgt tcccatagta acgccaatag 600
ggactttcca ttgacgtcaa tgggtggagt atttacggta aactgcccac ttggcagtac 660


atcaagtgta tcatatgcca agtccgcccc ctattgacgt caatgacggt aaatggcccg 720
cctggcatta tgcccagtac atgaccttac gggactttcc tacttggcag tacatctacg 780
tattagtcat cgctattacc atggtgatgc ggttttggca gtacaccaat gggcgtggat 840
agcggtttga ctcacgggga tttccaagtc tccaccccat tgacgtcaat gggagtttgt 900
tttggcacca aaatcaacgg gactttccaa aatgtcgtaa taaccccgcc ccgttgacgc 960
aaatgggcgg taggcgtgta cggtgggagg tctatataag cagagctcgt ttagtgaacc 1020
gtcagatcgc ctggagacgc catccacgct gttttgacct ccatagaaga caccgggacc 1080
gatccagcct ccgcggccgg gaacggtgca ttggaacgcg gattccccgt gccaagagtg 1140
acgtaagtac cgcctataga ctctataggc acaccccttt ggctcttatg catgctatac 1200
tgtttttggc ttggggccta tacacccccg ctccttatgc tataggtgat ggtatagctt 1260
agcctatagg tgtgggttat tgaccattat tgaccactcc cctattggtg acgatacttt 1320
ccattactaa tccataacat ggctctttgc cacaactatc tctattggct atatgccaat 1380
actctgtcct tcagagactg acacggactc tgtattttta caggatgggg tccatttatt 1440
atttacaaat tcacatatac aacaacgccg tcccccgtgc ccgcagtttt tattaaacat 1500
agcgtgggat ctccgacatc tcgggtacgt gttccggaca tgggctcttc tccggtagcg 1560
gcggagcttc cacatccgag ccctggtccc atccgtccag cggctcatgg tcgctcggca 1620
gctccttgct cctaacagtg gaggccagac ttaggcacag cacaatgccc accaccacca 1680
gtgtgccgca caaggccgtg gcggtagggt atgtgtctga aaatgagctc ggagattggg 1740
ctcgcacctg gacgcagatg gaagacttaa ggcagcggca gaagaagatg caggcagctg 1800
agttgttgta ttctgataag agtcagaggt aactcccgtt gcggtgctgt taacggtgga 1860
gggcagtgta gtctgagcag tactcgttgc tgccgcgcgc gccaccagac ataatagctg 1920
acagactaac agactgttcc tttccatggg tcttttctgc agtcaccgtc gtcgacctaa 1980


gaattcacc atg gcg ccc atc acg gcg tac gcc cag cag aca agg ggc ctc 2031
Met Ala Pro Ile Thr Ala Tyr Ala Gln Gln Thr Arg Gly Leu
1 5 10

cta ggg tgc ata atc acc agc cta act ggc cgg gac aaa aac caa gtg 2079
Leu Gly Cys Ile Ile Thr Ser Leu Thr Gly Arg Asp Lys Asn Gln Val
15 20 25 30

gag ggt gag gtc cag att gtg tca act gct gcc caa acc ttc ctg gca 2127
Glu Gly Glu Val Gln Ile Val Ser Thr Ala Ala Gln Thr Phe Leu Ala
35 40 45

acg tgc atc aat ggg gtg tgc tgg act gtc tac cac ggg gcc gga acg 2175
Thr Cys Ile Asn Gly Val Cys Trp Thr Val Tyr His Gly Ala Gly Thr
50 55 60

agg acc atc gcg tca ccc aag ggt cct gtc atc cag atg tat acc aat 2223
Arg Thr Ile Ala Ser Pro Lys Gly Pro Val Ile Gln Met Tyr Thr Asn
65 70 75

gta gac caa gac ctt gtg ggc tgg ccc gct tcg caa ggt acc cgc tca 2271
Val Asp Gln Asp Leu Val Gly Trp Pro Ala Ser Gln Gly Thr Arg Ser
80 85 90

ttg aca ccc tgc act tgc ggc tcc tcg gac ctt tac ctg gtc acg agg 2319
Leu Thr Pro Cys Thr Cys Gly Ser Ser Asp Leu Tyr Leu Val Thr Arg
95 100 105 110

cac gcc gat gtc att ccc gtg cgc cgg cgg ggt gat agc agg ggc agc 2367
His Ala Asp Val Ile Pro Val Arg Arg Arg Gly Asp Ser Arg Gly Ser
115 120 125

ctg ctg tcg ccc cgg ccc att tcc tac ttg aaa ggc tcc tcg ggg ggt 2415
Leu Leu Ser Pro Arg Pro Ile Ser Tyr Leu Lys Gly Ser Ser Gly Gly
130 135 140

ccg ctg ttg tgc ccc gcg ggg cac gcc gtg ggc ata ttt agg gcc gcg 2463
Pro Leu Leu Cys Pro Ala Gly His Ala Val Gly Ile Phe Arg Ala Ala
145 150 155

gtg tgc acc cgt gga gtg gct aag gcg gtg gac ttt atc cct gtg gag 2511
Val Cys Thr Arg Gly Val Ala Lys Ala Val Asp Phe Ile Pro Val Glu
160 165 170

aac cta gag aca acc atg agg tcc ccg gtg ttc acg gat aac tcc tct 2559
Asn Leu Glu Thr Thr Met Arg Ser Pro Val Phe Thr Asp Asn Ser Ser
175 180 185 190

cca cca gta gtg ccc cag agc ttc cag gtg gct cac ctc cat gct ccc 2607
Pro Pro Val Val Pro Gln Ser Phe Gln Val Ala His Leu His Ala Pro
195 200 205

aca ggc agc ggc aaa agc acc aag gtc ccg gct gca tat gca gct cag 2655
Thr Gly Ser Gly Lys Ser Thr Lys Val Pro Ala Ala Tyr Ala Ala Gln
210 215 220

ggc tat aag gtg cta gta ctc aac ccc tct gtt gct gca aca ctg ggc 2703
Gly Tyr Lys Val Leu Val Leu Asn Pro Ser Val Ala Ala Thr Leu Gly
225 230 235

ttt ggt gct tac atg tcc aag gct cat ggg atc gat cct aac atc agg 2751
Phe Gly Ala Tyr Met Ser Lys Ala His Gly Ile Asp Pro Asn Ile Arg
240 245 250

acc ggg gtg aga aca att acc act ggc agc ccc atc acg tac tcc acc 2799
Thr Gly Val Arg Thr Ile Thr Thr Gly Ser Pro Ile Thr Tyr Ser Thr
255 260 265 270

tac ggc aag ttc ctt gcc gac ggc ggg tgc tcg ggg ggc gct tat gac 2847
Tyr Gly Lys Phe Leu Ala Asp Gly Gly Cys Ser Gly Gly Ala Tyr Asp
275 280 285

ata ata att tgt gac gag tgc cac tcc acg gat gcc aca tcc atc ttg 2895
Ile Ile Ile Cys Asp Glu Cys His Ser Thr Asp Ala Thr Ser Ile Leu
290 295 300

ggc att ggc act gtc ctt gac caa gca gag act gcg ggg gcg aga ctg 2943
Gly Ile Gly Thr Val Leu Asp Gln Ala Glu Thr Ala Gly Ala Arg Leu
305 310 315

gtt gtg ctc gcc acc gcc acc cct ccg ggc tcc gtc act gtg ccc cat 2991
Val Val Leu Ala Thr Ala Thr Pro Pro Gly Ser Val Thr Val Pro His
320 325 330

ccc aac atc gag gag gtt gct ctg tcc acc acc gga gag atc cct ttt 3039
Pro Asn Ile Glu Glu Val Ala Leu Ser Thr Thr Gly Glu Ile Pro Phe
335 340 345 350

tac ggc aag gct atc ccc ctc gaa gta atc aag ggg ggg aga cat ctc 3087
Tyr Gly Lys Ala Ile Pro Leu Glu Val Ile Lys Gly Gly Arg His Leu
355 360 365

atc ttc tgt cat tca aag aag aag tgc gac gaa ctc gcc gca aag ctg 3135
Ile Phe Cys His Ser Lys Lys Lys Cys Asp Glu Leu Ala Ala Lys Leu
370 375 380

gtc gca ttg ggc atc aat gcc gtg gcc tac tac cgc ggt ctt gac gtg 3183
Val Ala Leu Gly Ile Asn Ala Val Ala Tyr Tyr Arg Gly Leu Asp Val
385 390 395

tcc gtc atc ccg acc agc ggc gat gtt gtc gtc gtg gca acc gat gcc 3231
Ser Val Ile Pro Thr Ser Gly Asp Val Val Val Val Ala Thr Asp Ala
400 405 410

ctc atg acc ggc tat acc ggc gac ttc gac tcg gtg ata gac tgc aat 3279
Leu Met Thr Gly Tyr Thr Gly Asp Phe Asp Ser Val Ile Asp Cys Asn
415 420 425 430

acg tgt gtc acc cag aca gtc gat ttc agc ctt gac cct acc ttc acc 3327
Thr Cys Val Thr Gln Thr Val Asp Phe Ser Leu Asp Pro Thr Phe Thr
435 440 445

att gag aca atc acg ctc ccc caa gat gct gtc tcc cgc act caa cgt 3375
Ile Glu Thr Ile Thr Leu Pro Gln Asp Ala Val Ser Arg Thr Gln Arg
450 455 460

cgg ggc agg act ggc agg ggg aag cca ggc atc tac aga ttt gtg gca 3423
Arg Gly Arg Thr Gly Arg Gly Lys Pro Gly Ile Tyr Arg Phe Val Ala
465 470 475

ccg ggg gag cgc ccc tcc ggc atg ttc gac tcg tcc gtc ctc tgt gag 3471
Pro Gly Glu Arg Pro Ser Gly Met Phe Asp Ser Ser Val Leu Cys Glu
480 485 490

tgc tat gac gca ggc tgt gct tgg tat gag ctc acg ccc gcc gag act 3519
Cys Tyr Asp Ala Gly Cys Ala Trp Tyr Glu Leu Thr Pro Ala Glu Thr
495 500 505 510

aca gtt agg cta cga gcg tac atg aac acc ccg ggg ctt ccc gtg tgc 3567
Thr Val Arg Leu Arg Ala Tyr Met Asn Thr Pro Gly Leu Pro Val Cys
515 520 525

cag gac cat ctt gaa ttt tgg gag ggc gtc ttt aca ggc ctc act cat 3615
Gln Asp His Leu Glu Phe Trp Glu Gly Val Phe Thr Gly Leu Thr His
530 535 540

ata gat gcc cac ttt cta tcc cag aca aag cag agt ggg gag aac ctt 3663
Ile Asp Ala His Phe Leu Ser Gln Thr Lys Gln Ser Gly Glu Asn Leu
545 550 555

cct tac ctg gta gcg tac caa gcc acc gtg tgc gct agg gct caa gcc 3711
Pro Tyr Leu Val Ala Tyr Gln Ala Thr Val Cys Ala Arg Ala Gln Ala
560 565 570

cct ccc cca tcg tgg gac cag atg tgg aag tgt ttg att cgc ctc aag 3759
Pro Pro Pro Ser Trp Asp Gln Met Trp Lys Cys Leu Ile Arg Leu Lys
575 580 585 590

ccc acc ctc cat ggg cca aca ccc ctg cta tac aga ctg ggc gct gtt 3807
Pro Thr Leu His Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Ala Val
595 600 605

cag aat gaa atc acc ctg acg cac cca gtc acc aaa tac atc atg aca 3855
Gln Asn Glu Ile Thr Leu Thr His Pro Val Thr Lys Tyr Ile Met Thr
610 615 620

tgc atg tcg gcc gac ctg gag gtc gtc acg agc acc tgg gtg ctc gtt 3903
Cys Met Ser Ala Asp Leu Glu Val Val Thr Ser Thr Trp Val Leu Val
625 630 635

ggc ggc gtc ctg gct gct ttg gcc gcg tat tgc ctg tca aca ggc tgc 3951
Gly Gly Val Leu Ala Ala Leu Ala Ala Tyr Cys Leu Ser Thr Gly Cys
640 645 650

gtg gtc ata gtg ggc agg gtc gtc ttg tcc ggg aag ccg gca atc ata 3999
Val Val Ile Val Gly Arg Val Val Leu Ser Gly Lys Pro Ala Ile Ile
655 660 665 670

cct gac agg gaa gtc ctc tac cga gag ttc gat gag atg gaa gag tgc 4047
Pro Asp Arg Glu Val Leu Tyr Arg Glu Phe Asp Glu Met Glu Glu Cys
675 680 685




taggatccac tacgcgttag agctcgctga tcagcctcga ctgtgccttc tagttgccag 4107
ccatctgttg tttgcccctc ccccgtgcct tccttgaccc tggaaggtgc cactcccact 4167
gtcctttcct aataaaatga ggaaattgca tcgcattgtc tgagtaggtg tcattctatt 4227
ctggggggtg gggtggggca ggacagcaag ggggaggatt gggaagacaa tagcaggcat 4287
gctggggagc tcttccgctt cctcgctcac tgactcgctg cgctcggtcg ttcggctgcg 4347
gcgagcggta tcagctcact caaaggcggt aatacggtta tccacagaat caggggataa 4407
cgcaggaaag aacatgtgag caaaaggcca gcaaaaggcc aggaaccgta aaaaggccgc 4467
gttgctggcg tttttccata ggctccgccc ccctgacgag catcacaaaa atcgacgctc 4527
aagtcagagg tggcgaaacc cgacaggact ataaagatac caggcgtttc cccctggaag 4587
ctccctcgtg cgctctcctg ttccgaccct gccgcttacc ggatacctgt ccgcctttct 4647
cccttcggga agcgtggcgc tttctcaatg ctcacgctgt aggtatctca gttcggtgta 4707


ggtcgttcgc tccaagctgg gctgtgtgca cgaacccccc gttcagcccg accgctgcgc 4767
cttatccggt aactatcgtc ttgagtccaa cccggtaaga cacgacttat cgccactggc 4827
agcagccact ggtaacagga ttagcagagc gaggtatgta ggcggtgcta cagagttctt 4887
gaagtggtgg cctaactacg gctacactag aaggacagta tttggtatct gcgctctgct 4947
gaagccagtt accttcggaa aaagagttgg tagctcttga tccggcaaac aaaccaccgc 5007
tggtagcggt ggtttttttg tttgcaagca gcagattacg cgcagaaaaa aaggatctca 5067
agaagatcct ttgatctttt ctacggggtc tgacgctcag tggaacgaaa actcacgtta 5127
agggattttg gtcatgagat tatcaaaaag gatcttcacc tagatccttt taaattaaaa 5187
atgaagtttt aaatcaatct aaagtatata tgagtaaact tggtctgaca gttaccaatg 5247
cttaatcagt gaggcaccta tctcagcgat ctgtctattt cgttcatcca tagttgcctg 5307
actccccgtc gtgtagataa ctacgatacg ggagggctta ccatctggcc ccagtgctgc 5367
aatgataccg cgagacccac gctcaccggc tccagattta tcagcaataa accagccagc 5427
cggaagggcc gagcgcagaa gtggtcctgc aactttatcc gcctccatcc agtctattaa 5487
ttgttgccgg gaagctagag taagtagttc gccagttaat agtttgcgca acgttgttgc 5547
cattgctaca ggcatcgtgg tgtcacgctc gtcgtttggt atggcttcat tcagctccgg 5607
ttcccaacga tcaaggcgag ttacatgatc ccccatgttg tgcaaaaaag cggttagctc 5667
cttcggtcct ccgatcgttg tcagaagtaa gttggccgca gtgttatcac tcatggttat 5727
ggcagcactg cataattctc ttactgtcat gccatccgta agatgctttt ctgtgactgg 5787
tgagtactca accaagtcat tctgagaata gtgtatgcgg cgaccgagtt gctcttgccc 5847
ggcgtcaata cgggataata ccgcgccaca tagcagaact ttaaaagtgc tcatcattgg 5907
aaaacgttct tcggggcgaa aactctcaag gatcttaccg ctgttgagat ccagttcgat 5967
gtaacccact cgtgcaccca actgatcttc agcatctttt actttcacca gcgtttctgg 6027
gtgagcaaaa acaggaaggc aaaatgccgc aaaaaaggga ataagggcga cacggaaatg 6087
ttgaatactc atactcttcc tttttcaata ttattgaagc atttatcagg gttattgtct 6147
catgagcgga tacatatttg aatgtattta gaaaaataaa caaatagggg ttccgcgcac 6207
atttccccga aaagtgccac ctgacgtcta agaaaccatt attatcatga cattaaccta 6267
taaaaatagg cgtatcacga ggccctttcg tc 6299



<210> 7 
<211> 686 
<212> PRT 

<213> Artificial Sequence 
<220>
<223> Description of Artificial Sequence: pNS34a
<400> 7

Met Ala Pro Ile Thr Ala Tyr Ala Gln Gln Thr Arg Gly Leu Leu Gly
1 5 10 15

Cys Ile Ile Thr Ser Leu Thr Gly Arg Asp Lys Asn Gln Val Glu Gly
20 25 30

Glu Val Gln Ile Val Ser Thr Ala Ala Gln Thr Phe Leu Ala Thr Cys
35 40 45

Ile Asn Gly Val Cys Trp Thr Val Tyr His Gly Ala Gly Thr Arg Thr
50 55 60

Ile Ala Ser Pro Lys Gly Pro Val Ile Gln Met Tyr Thr Asn Val Asp
65 70 75 80

Gln Asp Leu Val Gly Trp Pro Ala Ser Gln Gly Thr Arg Ser Leu Thr
85 90 95

Pro Cys Thr Cys Gly Ser Ser Asp Leu Tyr Leu Val Thr Arg His Ala
100 105 110

Asp Val Ile Pro Val Arg Arg Arg Gly Asp Ser Arg Gly Ser Leu Leu
115 120 125

Ser Pro Arg Pro Ile Ser Tyr Leu Lys Gly Ser Ser Gly Gly Pro Leu
130 135 140

Leu Cys Pro Ala Gly His Ala Val Gly Ile Phe Arg Ala Ala Val Cys
145 150 155 160

Thr Arg Gly Val Ala Lys Ala Val Asp Phe Ile Pro Val Glu Asn Leu
165 170 175

Glu Thr Thr Met Arg Ser Pro Val Phe Thr Asp Asn Ser Ser Pro Pro
180 185 190

Val Val Pro Gln Ser Phe Gln Val Ala His Leu His Ala Pro Thr Gly
195 200 205

Ser Gly Lys Ser Thr Lys Val Pro Ala Ala Tyr Ala Ala Gln Gly Tyr
210 215 220

Lys Val Leu Val Leu Asn Pro Ser Val Ala Ala Thr Leu Gly Phe Gly
225 230 235 240

Ala Tyr Met Ser Lys Ala His Gly Ile Asp Pro Asn Ile Arg Thr Gly
245 250 255

Val Arg Thr Ile Thr Thr Gly Ser Pro Ile Thr Tyr Ser Thr Tyr Gly
260 265 270

Lys Phe Leu Ala Asp Gly Gly Cys Ser Gly Gly Ala Tyr Asp Ile Ile
275 280 285

Ile Cys Asp Glu Cys His Ser Thr Asp Ala Thr Ser Ile Leu Gly Ile
290 295 300

Gly Thr Val Leu Asp Gln Ala Glu Thr Ala Gly Ala Arg Leu Val Val
305 310 315 320

Leu Ala Thr Ala Thr Pro Pro Gly Ser Val Thr Val Pro His Pro Asn
325 330 335

Ile Glu Glu Val Ala Leu Ser Thr Thr Gly Glu Ile Pro Phe Tyr Gly
340 345 350

Lys Ala Ile Pro Leu Glu Val Ile Lys Gly Gly Arg His Leu Ile Phe
355 360 365

Cys His Ser Lys Lys Lys Cys Asp Glu Leu Ala Ala Lys Leu Val Ala
370 375 380

Leu Gly Ile Asn Ala Val Ala Tyr Tyr Arg Gly Leu Asp Val Ser Val
385 390 395 400

Ile Pro Thr Ser Gly Asp Val Val Val Val Ala Thr Asp Ala Leu Met
405 410 415

Thr Gly Tyr Thr Gly Asp Phe Asp Ser Val Ile Asp Cys Asn Thr Cys
420 425 430

Val Thr Gln Thr Val Asp Phe Ser Leu Asp Pro Thr Phe Thr Ile Glu
435 440 445

Thr Ile Thr Leu Pro Gln Asp Ala Val Ser Arg Thr Gln Arg Arg Gly
450 455 460

Arg Thr Gly Arg Gly Lys Pro Gly Ile Tyr Arg Phe Val Ala Pro Gly
465 470 475 480

Glu Arg Pro Ser Gly Met Phe Asp Ser Ser Val Leu Cys Glu Cys Tyr
485 490 495

Asp Ala Gly Cys Ala Trp Tyr Glu Leu Thr Pro Ala Glu Thr Thr Val
500 505 510

Arg Leu Arg Ala Tyr Met Asn Thr Pro Gly Leu Pro Val Cys Gln Asp
515 520 525

His Leu Glu Phe Trp Glu Gly Val Phe Thr Gly Leu Thr His Ile Asp
530 535 540

Ala His Phe Leu Ser Gln Thr Lys Gln Ser Gly Glu Asn Leu Pro Tyr
545 550 555 560

Leu Val Ala Tyr Gln Ala Thr Val Cys Ala Arg Ala Gln Ala Pro Pro
565 570 575

Pro Ser Trp Asp Gln Met Trp Lys Cys Leu Ile Arg Leu Lys Pro Thr
580 585 590

Leu His Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Ala Val Gln Asn
595 600 605

Glu Ile Thr Leu Thr His Pro Val Thr Lys Tyr Ile Met Thr Cys Met
610 615 620

Ser Ala Asp Leu Glu Val Val Thr Ser Thr Trp Val Leu Val Gly Gly
625 630 635 640

Val Leu Ala Ala Leu Ala Ala Tyr Cys Leu Ser Thr Gly Cys Val Val
645 650 655

Ile Val Gly Arg Val Val Leu Ser Gly Lys Pro Ala Ile Ile Pro Asp
660 665 670

Arg Glu Val Leu Tyr Arg Glu Phe Asp Glu Met Glu Glu Cys
675 680 685

<210> 8 
<211> 19912 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: pd.deltaNS3NS5 

<220> 
<221> CDS 

<222> (12745)..(18057)
<400> 8 



atcgatccta ccccttgcgc taaagaagta tatgtgccta ctaacgcttg tctttgtctc 60
tgtcactaaa cactggatta ttactcccag atacttattt tggactaatt taaatgattt 120
cggatcaacg ttcttaatat cgctgaatct tccacaattg atgaaagtag ctaggaagag 180
gaattggtat aaagtttttg tttttgtaaa tctcgaagta tactcaaacg aatttagtat 240
tttctcagtg atctcccaga tgctttcacc ctcacttaga agtgctttaa gcattttttt 300
actgtggcta tttcccttat ctgcttcttc cgatgattcg aactgtaatt gcaaactact 360
tacaatatca gtgatatcag attgatgttt ttgtccatag taaggaataa ttgtaaattc 420
ccaagcagga atcaatttct ttaatgaggc ttccagaatt gttgcttttt gcgtcttgta 480
tttaaactgg agtgatttat tgacaatatc gaaactcagc gaattgctta tgatagtatt 540
atagctcatg aatgtggctc tcttgattgc tgttccgtta tgtgtaatca tccaacataa 600
ataggttagt tcagcagcac ataatgctat tttctcacct gaaggtcttt caaacctttc 660
cacaaactga cgaacaagca ccttaggtgg tgttttacat aatatatcaa attgtggcat 720
gcttagcgcc gatcttgtgt gcaattgata tctagtttca actactctat ttatcttgta 780
tcttgcagta ttcaaacacg ctaactcgaa aaactaactt taattgtcct gtttgtctcg 840
cgttctttcg aaaaatgcac cggccgcgca ttatttgtac tgcgaaaata attggtactg 900
cggtatcttc atttcatatt ttaaaaatgc acctttgctg cttttcctta atttttagac 960


ggcccgcagg ttcgttttgc ggtactatct tgtgataaaa agttgttttg acatgtgatc 1020
tgcacagatt ttataatgta ataagcaaga atacattatc aaacgaacaa tactggtaaa 1080
agaaaaccaa aatggacgac attgaaacag ccaagaatct gacggtaaaa gcacgtacag 1140
cttatagcgt ctgggatgta tgtcggctgt ttattgaaat gattgctcct gatgtagata 1200
ttgatataga gagtaaacgt aagtctgatg agctactctt tccaggatat gtcataaggc 1260
ccatggaatc tctcacaacc ggtaggccgt atggtcttga ttctagcgca gaagattcca 1320
gcgtatcttc tgactccagt gctgaggtaa ttttgcctgc tgcgaagatg gttaaggaaa 1380
ggtttgattc gattggaaat ggtatgctct cttcacaaga agcaagtcag gctgccatag 1440
atttgatgct acagaataac aagctgttag acaatagaaa gcaactatac aaatctattg 1500
ctataataat aggaagattg cccgagaaag acaagaagag agctaccgaa atgctcatga 1560
gaaaaatgga ttgtacacag ttattagtcc caccagctcc aacggaagaa gatgttatga 1620
agctcgtaag cgtcgttacc caattgctta ctttagttcc accagatcgt caagctgctt 1680
taataggtga tttattcatc ccggaatctc taaaggatat attcaatagt ttcaatgaac 1740
tggcggcaga gaatcgttta cagcaaaaaa agagtgagtt ggaaggaagg actgaagtga 1800
accatgctaa tacaaatgaa gaagttccct ccaggcgaac aagaagtaga gacacaaatg 1860
caagaggagc atataaatta caaaacacca tcactgaggg ccctaaagcg gttcccacga 1920
aaaaaaggag agtagcaacg agggtaaggg gcagaaaatc acgtaatact tctagggtat 1980
gatccaatat caaaggaaat gatagcattg aaggatgaga ctaatccaat tgaggagtgg 2040
cagcatatag aacagctaaa gggtagtgct gaaggaagca tacgataccc cgcatggaat 2100
gggataatat cacaggaggt actagactac ctttcatcct acataaatag acgcatataa 2160
gtacgcattt aagcataaac acgcactatg ccgttcttct catgtatata tatatacagg 2220
caacacgcag atataggtgc gacgtgaaca gtgagctgta tgtgcgcagc tcgcgttgca 2280
ttttcggaag cgctcgtttt cggaaacgct ttgaagttcc tattccgaag ttcctattct 2340
ctagaaagta taggaacttc agagcgcttt tgaaaaccaa aagcgctctg aagacgcact 2400
ttcaaaaaac caaaaacgca ccggactgta acgagctact aaaatattgc gaataccgct 2460
tccacaaaca ttgctcaaaa gtatctcttt gctatatatc tctgtgctat atccctatat 2520
aacctaccca tccacctttc gctccttgaa cttgcatcta aactcgacct ctacatcaac 2580
aggcttccaa tgctcttcaa attttactgt caagtagacc catacggctg taatatgctg 2640
ctcttcataa tgtaagctta tctttatcga atcgtgtgaa aaactactac cgcgataaac 2700
ctttacggtt ccctgagatt gaattagttc ctttagtata tgatacaaga cacttttgaa 2760
ctttgtacga cgaattttga ggttcgccat cctctggcta tttccaatta tcctgtcggc 2820






gaaacatgct gcttaaaact ccaagcggta ggagaccgat aaaggttaat aggacagccg 2880
tattatctcc gcctcagttt gatcttccgc ttcagactgc catttttcac ataatgaatc 2940
tatttcaccc cacaatcctt catccgcctc cgcatcttgt tccgttaaac tattgacttc 3000
atgttgtaca ttgtttagtt cacgagaagg gtcctcttca ggcggtagct cctgatctcc 3060
tatatgacct ttatcctgtt ctctttccac aaacttagaa atgtattcat gaattatgga 3120
gcacctaata acattcttca aggcggagaa gtttgggcca gatgcccaat atgcttgaca 3180
tgaaaacgtg agaatgaatt tagtattatt gtgatattct gaggcaattt tattataatc 3240
tcgaagataa gagaagaatg cagtgacctt tgtattgaca aatggagatt ccatgtatct 3300
aaaaaatacg cctttaggcc ttctgatacc ctttcccctg cggtttagcg tgccttttac 3360
attaatatct aaaccctctc cgatggtggc ctttaactga ctaataaatg caaccgatat 3420
aaactgtgat aattctgggt gatttatgat tcgatcgaca attgtattgt acactagtgc 3480
aggatcaggc caatccagtt ctttttcaat taccggtgtg tcgtctgtat tcagtacatg 3540
tccaacaaat gcaaatgcta acgttttgta tttcttataa ttgtcaggaa ctggaaaagt 3600
cccccttgtc gtctcgatta cacacctact ttcatcgtac accataggtt ggaagtgctg 3660
cataatacat tgcttaatac aagcaagcag tctctcgcca ttcatatttc agttattttc 3720
cattacagct gatgtcattg tatatcagcg ctgtaaaaat ctatctgtta cagaaggttt 3780
tcgcggtttt tataaacaaa actttcgtta cgaaatcgag caatcacccc agctgcgtat 3840
ttggaaattc gggaaaaagt agagcaacgc gagttgcatt ttttacacca taatgcatga 3900
ttaacttcga gaagggatta aggctaattt cactagtatg tttcaaaaac ctcaatctgt 3960
ccattgaatg ccttataaaa cagctataga ttgcatagaa gagttagcta ctcaatgctt 4020
tttgtcaaag cttactgatg atgatgtgtc tactttcagg cgggtctgta gtaaggagaa 4080
tgacattata aagctggcac ttagaattcc acggactata gactatacta gtatactccg 4140
tctactgtac gatacacttc cgctcaggtc cttgtccttt aacgaggcct taccactctt 4200
ttgttactct attgatccag ctcagcaaag gcagtgtgat ctaagattct atcttcgcga 4260
tgtagtaaaa ctagctagac cgagaaagag actagaaatg caaaaggcac ttctacaatg 4320
gctgccatca ttattatccg atgtgacgct gcattttttt tttttttttt tttttttttt 4380
tttttttttt tttttttttt ttttttggta caaatatcat aaaaaaagag aatcttttta 4440
agcaaggatt ttcttaactt cttcggcgac agcatcaccg acttcggtgg tactgttgga 4500
accacctaaa tcaccagttc tgatacctgc atccaaaacc tttttaactg catcttcaat 4560


ggctttacct 


tcttcaggca 


agttcaatga 


caatttcaac 


atcattgcag 


cagacaagat 


4620 
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agtggcgata 


gggttgacct 


tattctttgg 


caaatctgga 


gcggaaccat 


ggcatggttc 


4680 


gtacaaacca aatgcggtgt tcttgtctgg caaagaggcc aaggacgcag atggcaacaa 4740
acccaaggag cctgggataa cggaggcttc ^tcggagatg atatcaccaa acatgttgct 4800
ggtgattata ataccattta ggtgggttgg gttcttaact aggatcatgg cggcagaatc 4860
aatcaattga tgttgaactt tcaatgtagg gaattcgttc ttgatggttt cctccacagt 4920
ttttctccat aatcttgaag aggccaaaac attagcttta tccaaggacc aaataggcaa 4980
tggtggctca tgttgtaggg ccatgaaagc ggccattctt gtgattcttt gcacttctgg 5040
aacggtgtat tgttcactat cccaagcgac accatcacca tcgtcttcct ttctcttacc 5100
aaagtaaata cctcccacta attctctaac aacaacgaag tcagtacctt tagcaaattg 5160
tggcttgatt ggagataagt ctaaaagaga gtcggatgca aagttacatg gtcttaagtt 5220
ggcgtacaat tgaagttctt tacggatttt tagtaaacct tgttcaggtc taacactacc 5280
ggtaccccat ttaggaccac ccacagcacc taacaaaacg gcatcagcct tcttggaggc 5340
ttccagcgcc tcatctggaa gtggaacacc tgtagcatcg atagcagcac caccaattaa 5400
atgattttcg aaatcgaact tgacattgga acgaacatca gaaatagctt taagaacctt 5460
aatggcttcg gctgtgattt cttgaccaac gtggtcacct ggcaaaacga cgatcttctt 5520
aggggcagac attacaatgg tatatccttg aaatatatat aaaaaaaaaa aaaaaaaaaa 5580
aaaaaaaaaa atgcagcttc tcaatgatat tcgaatacgc tttgaggaga tacagcctaa 5640
tatccgacaa actgttttac agatttacga tcgtacttgt tacccatcat tgaattttga 5700
acatccgaac ctgggagttt tccctgaaac agatagtata tttgaacctg tataataata 5760
tatagtctag cgctttacgg aagacaatgt atgtatttcg gttcctggag aaactattgc 5820
atctattgca taggtaatct tgcacgtcgc atccccggtt cattttctgc gtttccatct 5880
tgcacttcaa tagcatatct ttgttaacga agcatctgtg cttcattttg tagaacaaaa 5940
atgcaacgcg agagcgctaa tttttcaaac aaagaatctg agctgcattt ttacagaaca 6000
gaaatgcaac gcgaaagcgc tattttacca acgaagaatc tgtgcttcat ttttgtaaaa 6060
caaaaatgca acgcgagagc gctaattttt caaacaaaga atctgagctg catttttaca 6120
gaacagaaat gcaacgcgag agcgctattt taccaacaaa gaatctatac ttcttttttg 6180
ttctacaaaa atgcatcccg agagcgctat ttttctaaca aagcatctta gattactttt 6240
tttctccttt gtgcgctcta taatgcagtc tcttgataac tttttgcact gtaggtccgt 6300
taaggttaga agaaggctac tttggtgtct attttctctt ccataaaaaa agcctgactc 6360


cacttcccgc gtttactgat tactagcgaa gctgcgggtg cattttttca agataaaggc 6420
atccccgatt atattctata ccgatgtgga ttgcgcatac tttgtgaaca gaaagtgata 6480
gcgttgatga ttcttcattg gtcagaaaat tatgaacggt ttcttctatt ttgtctctat 6540
atactacgta taggaaatgt ttacattttc gtattgtttt cgattcactc tatgaatagt 6600
tcttactaca atttttttgt ctaaagagta atactagaga taaacataaa aaatgtagag 6660
gtcgagttta gatgcaagtt caaggagcga aaggtggatg ggtaggttat atagggatat 6720
agcacagaga tatatagcaa agagatactt ttgagcaatg tttgtggaag cggtattcgc 6780
aatattttag tagctcgtta cagtccggtg cgtttttggt tttttgaaag tgcgtcttca 6840
gagcgctttt ggttttcaaa agcgctctga agttcctata ctttctagag aataggaact 6900
tcggaatagg aacttcaaag cgtttccgaa aacgagcgct tccgaaaatg caacgcgagc 6960
tgcgcacata cagctcactg ttcacgtcgc acctatatct gcgtgttgcc tgtatatata 7020
tatacatgag aagaacggca tagtgcgtgt ttatgcttaa atgcgtactt atatgcgtct 7080
atttatgtag gatgaaaggt agtctagtac ctcctgtgat attatcccat tccatgcggg 7140
gtatcgtatg cttccttcag cactaccctt tagctgttct atatgctgcc actcctcaat 7200
tggattagtc tcatccttca atgctatcat ttcctttgat attggatcat atgcatagta 7260
ccgagaaact agtgcgaagt agtgatcagg tattgctgtt atctgatgag tatacgttgt 7320
cctggccacg gcagaagcac gcttatcgct ccaatttccc acaacattag tcaactccgt 7380
taggcccttc attgaaagaa atgaggtcat caaatgtctt ccaatgtgag attttgggcc 7440
attttttata gcaaagattg aataaggcgc atttttcttc aaagctttat tgtacgatct 7500
gactaagtta tcttttaata attggtattc ctgtttattg cttgaagaat tgccggtcct 7560
atttactcgt tttaggactg gttcagaatt cctcaaaaat tcatccaaat atacaagtgg 7620
atcgatgata agctgtcaaa catgagaatt cttgaagacg aaagggcctc gtgatacgcc 7680
tatttttata ggttaatgtc atgataataa tggtttctta gacgtcaggt ggcacttttc 7740
ggggaaatgt gcgcggaacc cctatttgtt tatttttcta aatacattca aatatgtatc 7800
cgctcatgag acaataaccc tgataaatgc ttcaataata ttgaaaaagg aagagtatga 7860
gtattcaaca tttccgtgtc gcccttattc ccttttttgc ggcattttgc cttcctgttt 7920
ttgctcaccc agaaacgctg gtgaaagtaa aagatgctga agatcagttg ggtgcacgag 7980
tgggttacat cgaactggat ctcaacagcg gtaagatcct tgagagtttt cgccccgaag 8040
aacgttttcc aatgatgagc acttttaaag ttctgctatg tggcgcggta ttatcccgtg 8100
ttgacgccgg gcaagagcaa ctcggtcgcc gcatacacta ttctcagaat gacttggttg 8160
agtactcacc agtcacagaa aagcatctta cggatggcat gacagtaaga gaattatgca 8220
gtgctgccat aaccatgagt gataacactg cggccaactt acttctgaca acgatcggag 8280


gaccgaagga gctaaccgct tttttgcaca acatggggga tcatgtaact cgccttgatc 8340
gttgggaacc ggagctgaat gaagccatac caaacgacga gcgtgacacc acgatgcctg 8400
cagcaatggc aacaacgttg cgcaaactat taactggcga actacttact ctagcttccc 8460
ggcaacaatt aatagactgg atggaggcgg ataaagttgc aggaccactt ctgcgctcgg 8520
cccttccggc tggctggttt attgctgata aatctggagc cggtgagcgt gggtctcgcg 8580
gtatcattgc agcactgggg ccagatggta agccctcccg tatcgtagtt atctacacga 8640
cggggagtca ggcaactatg gatgaacgaa atagacagat cgctgagata ggtgcctcac 8700
tgattaagca ttggtaactg tcagaccaag tttactcata tatactttag attgatttaa 8760
aacttcattt ttaatttaaa aggatctagg tgaagatcct ttttgataat ctcatgacca 8820
aaatccctta acgtgagttt tcgttccact gagcgtcaga ccccgtagaa aagatcaaag 8880
gatcttcttg agatcctttt tttctgcgcg taatctgctg cttgcaaaca aaaaaaccac 8940
cgctaccagc ggtggtttgt ttgccggatc aagagctacc aactcttttt ccgaaggtaa 9000
ctggcttcag cagagcgcag ataccaaata ctgtccttct agtgtagccg tagttaggcc 9060
accacttcaa gaactctgta gcaccgccta catacctcgc tctgctaatc ctgttaccag 9120
tggctgctgc cagtggcgat aagtcgtgtc ttaccgggtt ggactcaaga cgatagttac 9180
cggataaggc gcagcggtcg ggctgaacgg ggggttcgtg cacacagccc agcttggagc 9240
gaacgaccta caccgaactg agatacctac agcgtgagct atgagaaagc gccacgcttc 9300
ccgaagggag aaaggcggac aggtatccgg taagcggcag ggtcggaaca ggagagcgca 9360
cgagggagct tccaggggga aacgcctggt atctttatag tcctgtcggg tttcgccacc 9420
tctgacttga gcgtcgattt ttgtgatgct cgtcaggggg gcggagccta tggaaaaacg 9480
ccagcaacgc ggccttttta cggttcctgg ccttttgctg gccttttgct cacatgttct 9540
ttcctgcgtt atcccctgat tctgtggata accgtattac cgcctttgag tgagctgata 9600
ccgctcgccg cagccgaacg accgagcgca gcgagtcagt gagcgaggaa gcggaagagc 9660
gcctgatgcg gtattttctc cttacgcatc tgtgcggtat ttcacaccgc atatggtgca 9720
ctctcagtac aatctgctct gatgccgcat agttaagcca gtatacactc cgctatcgct 9780
acgtgactgg gtcatggctg cgccccgaca cccgccaaca cccgctgacg cgccctgacg 9840
ggcttgtctg ctcccggcat ccgcttacag acaagctgtg accgtctccg ggagctgcat 9900
gtgtcagagg ttttcaccgt catcaccgaa acgcgcgagg cagctgcggt aaagctcatc 9960
agcgtggtcg tgaagcgatt cacagatgtc tgcctgttca tccgcgtcca gctcgttgag 10020
tttctccaga agcgttaatg tctggcttct gataaagcgg gccatgttaa gggcggtttt 10080


ttcctgtttg gtcactgatg cctccgtgta agggggattt ctgttcatgg gggtaatgat 10140
accgatgaaa cgagagagga tgctcacgat acgggttact gatgatgaac atgcccggtt 10200
actggaacgt tgtgagggta aacaactggc ggtatggatg cggcgggacc agagaaaaat 10260
cactcagggt caatgccagc gcttcgttaa tacagatgta ggtgttccac agggtagcca 10320
gcagcatcct gcgatgcaga tccggaacat aatggtgcag ggcgctgact tccgcgtttc 10380
cagactttac gaaacacgga aaccgaagac cattcatgtt gttgctcagg tcgcagacgt 10440
tttgcagcag cagtcgcttc acgttcgctc gcgtatcggt gattcattct gctaaccagt 10500
aaggcaaccc cgccagccta gccgggtcct caacgacagg agcacgatca tgcgcacccg 10560
tggccaggac ccaacgctgc ccgagatgcg ccgcgtgcgg ctgctggaga tggcggacgc 10620
gatggatatg ttctgccaag ggttggtttg cgcattcaca gttctccgca agaattgatt 10680
ggctccaatt cttggagtgg tgaatccgtt agcgaggtgc cgccggcttc cattcaggtc 10740
gaggtggccc ggctccatgc accgcgacgc aacgcgggga ggcagacaag gtatagggcg 10800
gcgcctacaa tccatgccaa cccgttccat gtgctcgccg aggcggcata aatcgccgtg 10860
acgatcagcg gtccaatgat cgaagttagg ctggtaagag ccgcgagcga tccttgaagc 10920
tgtccctgat ggtcgtcatc tacctgcctg gacagcatgg cctgcaacgc gggcatcccg 10980
atgccgccgg aagcgagaag aatcataatg gggaaggcca tccagcctcg cgtcgcgaac 11040
gccagcaaga cgtagcccag cgcgtcggcc gccatgccgg cgataatggc ctgcttctcg 11100
ccgaaacgtt tggtggcggg accagtgacg aaggcttgag cgagggcgtg caagattccg 11160
aataccgcaa gcgacaggcc gatcatcgtc gcgctccagc gaaagcggtc ctcgccgaaa 11220
atgacccaga gcgctgccgg cacctgtcct acgagttgca tgataaagaa gacagtcata 11280
agtgcggcga cgatagtcat gccccgcgcc caccggaagg agctgactgg gttgaaggct 11340
ctcaagggca tcggtcgagg atccttcaat atgcgcacat acgctgttat gttcaaggtc 11400
ccttcgttta agaacgaaag cggtcttcct tttgagggat gtttcaagtt gttcaaatct 11460
atcaaatttg caaatcccca gtctgtatct agagcgttga atcggtgatg cgatttgtta 11520
attaaattga tggtgtcacc attaccaggt ctagatatac caatggcaaa ctgagcacaa 11580
caataccagt ccggatcaac tggcaccatc tctcccgtag tctcatctaa tttttcttcc 11640
ggatgaggtt ccagatatac cgcaacacct ttattatggt ttccctgagg gaataataga 11700
atgtcccatt cgaaatcacc aattctaaac ctgggcgaat tgtatttcgg gtttgttaac 11760
tcgttccagt caggaatgtt ccacgtgaag ctatcttcca gcaaagtctc cacttcttca 11820
tcaaattgtg gagaatactc ccaatgctct tatctatggg acttccggga aacacagtac 11880


cgatacttcc caattcgtct tcagagctca ttgtttgttt gaagagacta atcaaagaat 11940
cgttttctca aaaaaattaa tatcttaact gatagtttga tcaaaggggc aaaacgtagg 12000
ggcaaacaaa cggaaaaatc gtttctcaaa ttttctgatg ccaagaactc taaccagtct 12060
tatctaaaaa ttgccttatg atccgtctct ccggttacag cctgtgtaac tgattaatcc 12120
tgcctttcta atcaccattc taatgtttta attaagggat tttgtcttca ttaacggctt 12180
tcgctcataa aaatgttatg acgttttgcc cgcaggcggg aaaccatcca cttcacgaga 12240
ctgatctcct ctgccggaac accgggcatc tccaacttat aagttggaga aataagagaa 12300
tttcagattg agagaatgaa aaaaaaaaac ccttagttca taggtccatt ctcttagcgc 12360
aactacagag aacaggggca caaacaggca aaaaacgggc acaacctcaa tggagtgatg 12420
caacctgcct ggagtaaatg atgacacaag gcaattgacc cacgcatgta tctatctcat 12480
tttcttacac cttctattac cttctgctct ctctgatttg gaaaaagctg aaaaaaaagg 12540
ttgaaaccag ttccctgaaa ttattcccct acttgactaa taagtatata aagacggtag 12600
gtattgattg taattctgta aatctatttc ttaaacttct taaattctac ttttatagtt 12660
agtctttttt ttagttttaa aacaccaaga acttagtttc gaataaacac acataaacaa 12720


acaagcttac aaaacaaatt cacc atg gct gca tat gca gct cag ggc tat 12771
Met Ala Ala Tyr Ala Ala Gln Gly Tyr
1 5

aag gtg cta gta ctc aac ccc tct gtt gct gca aca ctg ggc ttt ggt 12819
Lys Val Leu Val Leu Asn Pro Ser Val Ala Ala Thr Leu Gly Phe Gly
10 15 20 25

gct tac atg tcc aag gct cat ggg atc gat cct aac atc agg acc ggg 12867
Ala Tyr Met Ser Lys Ala His Gly Ile Asp Pro Asn Ile Arg Thr Gly
30 35 40

gtg aga aca att acc act ggc agc ccc atc acg tac tcc acc tac ggc 12915
Val Arg Thr Ile Thr Thr Gly Ser Pro Ile Thr Tyr Ser Thr Tyr Gly
45 50 55

aag ttc ctt gcc gac ggc ggg tgc tcg ggg ggc gct tat gac ata ata 12963
Lys Phe Leu Ala Asp Gly Gly Cys Ser Gly Gly Ala Tyr Asp Ile Ile
60 65 70

att tgt gac gag tgc cac tcc acg gat gcc aca tcc atc ttg ggc att 13011
Ile Cys Asp Glu Cys His Ser Thr Asp Ala Thr Ser Ile Leu Gly Ile
75 80 85

ggc act gtc ctt gac caa gca gag act gcg ggg gcg aga ctg gtt gtg 13059
Gly Thr Val Leu Asp Gln Ala Glu Thr Ala Gly Ala Arg Leu Val Val
90 95 100 105

ctc gcc acc gcc acc cct ccg ggc tcc gtc act gtg ccc cat ccc aac 13107
Leu Ala Thr Ala Thr Pro Pro Gly Ser Val Thr Val Pro His Pro Asn
110 115 120

atc gag gag gtt gct ctg tcc acc acc gga gag atc cct ttt tac ggc 13155
Ile Glu Glu Val Ala Leu Ser Thr Thr Gly Glu Ile Pro Phe Tyr Gly
125 130 135

aag gct atc ccc ctc gaa gta atc aag ggg ggg aga cat ctc atc ttc 13203
Lys Ala Ile Pro Leu Glu Val Ile Lys Gly Gly Arg His Leu Ile Phe
140 145 150

tgt cat tca aag aag aag tgc gac gaa ctc gcc gca aag ctg gtc gca 13251
Cys His Ser Lys Lys Lys Cys Asp Glu Leu Ala Ala Lys Leu Val Ala
155 160 165

ttg ggc atc aat gcc gtg gcc tac tac cgc ggt ctt gac gtg tcc gtc 13299
Leu Gly Ile Asn Ala Val Ala Tyr Tyr Arg Gly Leu Asp Val Ser Val
170 175 180 185

atc ccg acc agc ggc gat gtt gtc gtc gtg gca acc gat gcc ctc atg 13347
Ile Pro Thr Ser Gly Asp Val Val Val Val Ala Thr Asp Ala Leu Met
190 195 200

acc ggc tat acc ggc gac ttc gac tcg gtg ata gac tgc aat acg tgt 13395
Thr Gly Tyr Thr Gly Asp Phe Asp Ser Val Ile Asp Cys Asn Thr Cys
205 210 215

gtc acc cag aca gtc gat ttc agc ctt gac cct acc ttc acc att gag 13443
Val Thr Gln Thr Val Asp Phe Ser Leu Asp Pro Thr Phe Thr Ile Glu
220 225 230

aca atc acg ctc ccc caa gat gct gtc tcc cgc act caa cgt cgg ggc 13491
Thr Ile Thr Leu Pro Gln Asp Ala Val Ser Arg Thr Gln Arg Arg Gly
235 240 245

agg act ggc agg ggg aag cca ggc atc tac aga ttt gtg gca ccg ggg 13539
Arg Thr Gly Arg Gly Lys Pro Gly Ile Tyr Arg Phe Val Ala Pro Gly
250 255 260 265

gag cgc ccc tcc ggc atg ttc gac tcg tcc gtc ctc tgt gag tgc tat 13587
Glu Arg Pro Ser Gly Met Phe Asp Ser Ser Val Leu Cys Glu Cys Tyr
270 275 280

gac gca ggc tgt gct tgg tat gag ctc acg ccc gcc gag act aca gtt 13635
Asp Ala Gly Cys Ala Trp Tyr Glu Leu Thr Pro Ala Glu Thr Thr Val
285 290 295

agg cta cga gcg tac atg aac acc ccg ggg ctt ccc gtg tgc cag gac 13683
Arg Leu Arg Ala Tyr Met Asn Thr Pro Gly Leu Pro Val Cys Gln Asp
300 305 310

cat ctt gaa ttt tgg gag ggc gtc ttt aca ggc ctc act cat ata gat 13731
His Leu Glu Phe Trp Glu Gly Val Phe Thr Gly Leu Thr His Ile Asp
315 320 325

gcc cac ttt cta tcc cag aca aag cag agt ggg gag aac ctt cct tac 13779
Ala His Phe Leu Ser Gln Thr Lys Gln Ser Gly Glu Asn Leu Pro Tyr
330 335 340 345

ctg gta gcg tac caa gcc acc gtg tgc gct agg gct caa gcc cct ccc 13827
Leu Val Ala Tyr Gln Ala Thr Val Cys Ala Arg Ala Gln Ala Pro Pro
350 355 360

cca tcg tgg gac cag atg tgg aag tgt ttg att cgc ctc aag ccc acc 13875
Pro Ser Trp Asp Gln Met Trp Lys Cys Leu Ile Arg Leu Lys Pro Thr
365 370 375

ctc cat ggg cca aca ccc ctg cta tac aga ctg ggc gct gtt cag aat 13923
Leu His Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Ala Val Gln Asn
380 385 390

gaa atc acc ctg acg cac cca gtc acc aaa tac atc atg aca tgc atg 13971
Glu Ile Thr Leu Thr His Pro Val Thr Lys Tyr Ile Met Thr Cys Met
395 400 405

tcg gcc gac ctg gag gtc gtc acg agc acc tgg gtg ctc gtt ggc ggc 14019
Ser Ala Asp Leu Glu Val Val Thr Ser Thr Trp Val Leu Val Gly Gly
410 415 420 425

gtc ctg gct gct ttg gcc gcg tat tgc ctg tca aca ggc tgc gtg gtc 14067
Val Leu Ala Ala Leu Ala Ala Tyr Cys Leu Ser Thr Gly Cys Val Val
430 435 440

ata gtg ggc agg gtc gtc ttg tcc ggg aag ccg gca atc ata cct gac 14115
Ile Val Gly Arg Val Val Leu Ser Gly Lys Pro Ala Ile Ile Pro Asp
445 450 455

agg gaa gtc ctc tac cga gag ttc gat gag atg gaa gag tgc tct cag 14163
Arg Glu Val Leu Tyr Arg Glu Phe Asp Glu Met Glu Glu Cys Ser Gln
460 465 470

cac tta ccg tac atc gag caa ggg atg atg ctc gcc gag cag ttc aag 14211
His Leu Pro Tyr Ile Glu Gln Gly Met Met Leu Ala Glu Gln Phe Lys
475 480 485

cag aag gcc ctc ggc ctc ctg cag acc gcg tcc cgt cag gca gag gtt 14259
Gln Lys Ala Leu Gly Leu Leu Gln Thr Ala Ser Arg Gln Ala Glu Val
490 495 500 505

atc gcc cct gct gtc cag acc aac tgg caa aaa ctc gag acc ttc tgg 14307
Ile Ala Pro Ala Val Gln Thr Asn Trp Gln Lys Leu Glu Thr Phe Trp
510 515 520

gcg aag cat atg tgg aac ttc atc agt ggg ata caa tac ttg gcg ggc 14355
Ala Lys His Met Trp Asn Phe Ile Ser Gly Ile Gln Tyr Leu Ala Gly
525 530 535

ttg tca acg ctg cct ggt aac ccc gcc att gct tca ttg atg gct ttt 14403
Leu Ser Thr Leu Pro Gly Asn Pro Ala Ile Ala Ser Leu Met Ala Phe
540 545 550

aca gct gct gtc acc agc cca cta acc act agc caa acc ctc ctc ttc 14451
Thr Ala Ala Val Thr Ser Pro Leu Thr Thr Ser Gln Thr Leu Leu Phe
555 560 565

aac ata ttg ggg ggg tgg gtg gct gcc cag ctc gcc gcc ccc ggt gcc 14499
Asn Ile Leu Gly Gly Trp Val Ala Ala Gln Leu Ala Ala Pro Gly Ala
570 575 580 585

gct act gcc ttt gtg ggc gct ggc tta gct ggc gcc gcc atc ggc agt 14547
Ala Thr Ala Phe Val Gly Ala Gly Leu Ala Gly Ala Ala Ile Gly Ser
590 595 600

gtt gga ctg ggg aag gtc ctc ata gac atc ctt gca ggg tat ggc gcg 14595
Val Gly Leu Gly Lys Val Leu Ile Asp Ile Leu Ala Gly Tyr Gly Ala
605 610 615

ggc gtg gcg gga gct ctt gtg gca ttc aag atc atg agc ggt gag gtc 14643
Gly Val Ala Gly Ala Leu Val Ala Phe Lys Ile Met Ser Gly Glu Val
620 625 630

ccc tcc acg gag gac ctg gtc aat cta ctg ccc gcc atc ctc tcg ccc 14691
Pro Ser Thr Glu Asp Leu Val Asn Leu Leu Pro Ala Ile Leu Ser Pro
635 640 645

gga gcc ctc gta gtc ggc gtg gtc tgt gca gca ata ctg cgc cgg cac 14739
Gly Ala Leu Val Val Gly Val Val Cys Ala Ala Ile Leu Arg Arg His
650 655 660 665

gtt ggc ccg ggc gag ggg gca gtg cag tgg atg aac cgg ctg ata gcc 14787
Val Gly Pro Gly Glu Gly Ala Val Gln Trp Met Asn Arg Leu Ile Ala
670 675 680

ttc gcc tcc cgg ggg aac cat gtt tcc ccc acg cac tac gtg ccg gag 14835
Phe Ala Ser Arg Gly Asn His Val Ser Pro Thr His Tyr Val Pro Glu
685 690 695

agc gat gca gct gcc cgc gtc act gcc ata ctc agc agc ctc act gta 14883
Ser Asp Ala Ala Ala Arg Val Thr Ala Ile Leu Ser Ser Leu Thr Val
700 705 710

acc cag ctc ctg agg cga ctg cac cag tgg ata agc tcg gag tgt acc 14931
Thr Gln Leu Leu Arg Arg Leu His Gln Trp Ile Ser Ser Glu Cys Thr
715 720 725

act cca tgc tcc ggt tcc tgg cta agg gac atc tgg gac tgg ata tgc 14979
Thr Pro Cys Ser Gly Ser Trp Leu Arg Asp Ile Trp Asp Trp Ile Cys
730 735 740 745

gag gtg ttg agc gac ttt aag acc tgg cta aaa gct aag ctc atg cca 15027
Glu Val Leu Ser Asp Phe Lys Thr Trp Leu Lys Ala Lys Leu Met Pro
750 755 760

cag ctg cct ggg atc ccc ttt gtg tcc tgc cag cgc ggg tat aag ggg 15075
Gln Leu Pro Gly Ile Pro Phe Val Ser Cys Gln Arg Gly Tyr Lys Gly
765 770 775

gtc tgg cga ggg gac ggc atc atg cac act cgc tgc cac tgt gga gct 15123
Val Trp Arg Gly Asp Gly Ile Met His Thr Arg Cys His Cys Gly Ala
780 785 790

gag atc act gga cat gtc aaa aac ggg acg atg agg atc gtc ggt cct 15171
Glu Ile Thr Gly His Val Lys Asn Gly Thr Met Arg Ile Val Gly Pro
795 800 805

agg acc tgc agg aac atg tgg agt ggg acc ttc ccc att aat gcc tac 15219
Arg Thr Cys Arg Asn Met Trp Ser Gly Thr Phe Pro Ile Asn Ala Tyr
810 815 820 825

acc acg ggc ccc tgt acc ccc ctt cct gcg ccg aac tac acg ttc gcg 15267
Thr Thr Gly Pro Cys Thr Pro Leu Pro Ala Pro Asn Tyr Thr Phe Ala
830 835 840

cta tgg agg gtg tct gca gag gaa tac gtg gag ata agg cag gtg ggg 15315
Leu Trp Arg Val Ser Ala Glu Glu Tyr Val Glu Ile Arg Gln Val Gly
845 850 855

gac ttc cac tac gtg acg ggt atg act act gac aat ctt aaa tgc ccg 15363
Asp Phe His Tyr Val Thr Gly Met Thr Thr Asp Asn Leu Lys Cys Pro
860 865 870

tgc cag gtc cca tcg ccc gaa ttt ttc aca gaa ttg gac ggg gtg cgc 15411
Cys Gln Val Pro Ser Pro Glu Phe Phe Thr Glu Leu Asp Gly Val Arg
875 880 885

cta cat agg ttt gcg ccc ccc tgc aag ccc ttg ctg cgg gag gag gta 15459
Leu His Arg Phe Ala Pro Pro Cys Lys Pro Leu Leu Arg Glu Glu Val
890 895 900 905

tca ttc aga gta gga ctc cac gaa tac ccg gta ggg tcg caa tta cct 15507
Ser Phe Arg Val Gly Leu His Glu Tyr Pro Val Gly Ser Gln Leu Pro
910 915 920

tgc gag ccc gaa ccg gac gtg gcc gtg ttg acg tcc atg ctc act gat 15555
Cys Glu Pro Glu Pro Asp Val Ala Val Leu Thr Ser Met Leu Thr Asp
925 930 935

ccc tcc cat ata aca gca gag gcg gcc ggg cga agg ttg gcg agg gga 15603
Pro Ser His Ile Thr Ala Glu Ala Ala Gly Arg Arg Leu Ala Arg Gly
940 945 950

tca ccc ccc tct gtg gcc agc tcc tcg gct agc cag cta tcc gct cca 15651
Ser Pro Pro Ser Val Ala Ser Ser Ser Ala Ser Gln Leu Ser Ala Pro
955 960 965

tct ctc aag gca act tgc acc gct aac cat gac tcc cct gat gct gag 15699
Ser Leu Lys Ala Thr Cys Thr Ala Asn His Asp Ser Pro Asp Ala Glu
970 975 980 985

ctc ata gag gcc aac ctc cta tgg agg cag gag atg ggc ggc aac atc 15747
Leu Ile Glu Ala Asn Leu Leu Trp Arg Gln Glu Met Gly Gly Asn Ile
990 995 1000

acc agg gtt gag tca gaa aac aaa gtg gtg att ctg gac tcc ttc gat 15795
Thr Arg Val Glu Ser Glu Asn Lys Val Val Ile Leu Asp Ser Phe Asp
1005 1010 1015

ccg ctt gtg gcg gag gag gac gag cgg gag atc tcc gta ccc gca gaa 15843
Pro Leu Val Ala Glu Glu Asp Glu Arg Glu Ile Ser Val Pro Ala Glu
1020 1025 1030

atc ctg cgg aag tct cgg aga ttc gcc cag gcc ctg ccc gtt tgg gcg 15891
Ile Leu Arg Lys Ser Arg Arg Phe Ala Gln Ala Leu Pro Val Trp Ala
1035 1040 1045

cgg ccg gac tat aac ccc ccg cta gtg gag acg tgg aaa aag ccc gac 15939
Arg Pro Asp Tyr Asn Pro Pro Leu Val Glu Thr Trp Lys Lys Pro Asp
1050 1055 1060 1065

tac gaa cca cct gtg gtc cat ggc tgc ccg ctt cca cct cca aag tcc 15987
Tyr Glu Pro Pro Val Val His Gly Cys Pro Leu Pro Pro Pro Lys Ser
1070 1075 1080



cct cct gtg cct ccg cct cgg aag aag cgg acg gtg gtc ctc act gaa 16035
Pro Pro Val Pro Pro Pro Arg Lys Lys Arg Thr Val Val Leu Thr Glu
1085 1090 1095

tca acc cta tct act gcc ttg gcc gag ctc gcc acc aga agc ttt ggc 16083
Ser Thr Leu Ser Thr Ala Leu Ala Glu Leu Ala Thr Arg Ser Phe Gly
1100 1105 1110

agc tcc tca act tcc ggc att acg ggc gac aat acg aca aca tcc tct 16131
Ser Ser Ser Thr Ser Gly Ile Thr Gly Asp Asn Thr Thr Thr Ser Ser
1115 1120 1125

gag ccc gcc cct tct ggc tgc ccc ccc gac tcc gac gct gag tcc tat 16179
Glu Pro Ala Pro Ser Gly Cys Pro Pro Asp Ser Asp Ala Glu Ser Tyr
1130 1135 1140 1145

tcc tcc atg ccc ccc ctg gag ggg gag cct ggg gat ccg gat ctt agc 16227
Ser Ser Met Pro Pro Leu Glu Gly Glu Pro Gly Asp Pro Asp Leu Ser
1150 1155 1160

gac ggg tca tgg tca acg gtc agt agt gag gcc aac gcg gag gat gtc 16275
Asp Gly Ser Trp Ser Thr Val Ser Ser Glu Ala Asn Ala Glu Asp Val
1165 1170 1175

gtg tgc tgc tca atg tct tac tct tgg aca ggc gca ctc gtc acc ccg 16323
Val Cys Cys Ser Met Ser Tyr Ser Trp Thr Gly Ala Leu Val Thr Pro
1180 1185 1190

tgc gcc gcg gaa gaa cag aaa ctg ccc atc aat gca cta agc aac tcg 16371
Cys Ala Ala Glu Glu Gln Lys Leu Pro Ile Asn Ala Leu Ser Asn Ser
1195 1200 1205

ttg cta cgt cac cac aat ttg gtg tat tcc acc acc tca cgc agt gct 16419
Leu Leu Arg His His Asn Leu Val Tyr Ser Thr Thr Ser Arg Ser Ala
1210 1215 1220 1225

tgc caa agg cag aag aaa gtc aca ttt gac aga ctg caa gtt ctg gac 16467
Cys Gln Arg Gln Lys Lys Val Thr Phe Asp Arg Leu Gln Val Leu Asp
1230 1235 1240

agc cat tac cag gac gta ctc aag gag gtt aaa gca gcg gcg tca aaa 16515
Ser His Tyr Gln Asp Val Leu Lys Glu Val Lys Ala Ala Ala Ser Lys
1245 1250 1255

gtg aag gct aac ttg cta tcc gta gag gaa gct tgc agc ctg acg ccc 16563
Val Lys Ala Asn Leu Leu Ser Val Glu Glu Ala Cys Ser Leu Thr Pro
1260 1265 1270

cca cac tca gcc aaa tcc aag ttt ggt tat ggg gca aaa gac gtc cgt 16611
Pro His Ser Ala Lys Ser Lys Phe Gly Tyr Gly Ala Lys Asp Val Arg
1275 1280 1285

tgc cat gcc aga aag gcc gta acc cac atc aac tcc gtg tgg aaa gac 16659
Cys His Ala Arg Lys Ala Val Thr His Ile Asn Ser Val Trp Lys Asp
1290 1295 1300 1305

ctt ctg gaa gac aat gta aca cca ata gac act acc atc atg gct aag 16707
Leu Leu Glu Asp Asn Val Thr Pro Ile Asp Thr Thr Ile Met Ala Lys
1310 1315 1320



aac gag gtt ttc tgc gtt cag cct gag aag ggg ggt cgt aag cca gct 16755
Asn Glu Val Phe Cys Val Gln Pro Glu Lys Gly Gly Arg Lys Pro Ala
1325 1330 1335

cgt ctc atc gtg ttc ccc gat ctg ggc gtg cgc gtg tgc gaa aag atg 16803
Arg Leu Ile Val Phe Pro Asp Leu Gly Val Arg Val Cys Glu Lys Met
1340 1345 1350

gct ttg tac gac gtg gtt aca aag ctc ccc ttg gcc gtg atg gga agc 16851
Ala Leu Tyr Asp Val Val Thr Lys Leu Pro Leu Ala Val Met Gly Ser
1355 1360 1365

tcc tac gga ttc caa tac tca cca gga cag cgg gtt gaa ttc ctc gtg 16899
Ser Tyr Gly Phe Gln Tyr Ser Pro Gly Gln Arg Val Glu Phe Leu Val
1370 1375 1380 1385

caa gcg tgg aag tcc aag aaa acc cca atg ggg ttc tcg tat gat acc 16947
Gln Ala Trp Lys Ser Lys Lys Thr Pro Met Gly Phe Ser Tyr Asp Thr
1390 1395 1400

cgc tgc ttt gac tcc aca gtc act gag agc gac atc cgt acg gag gag 16995
Arg Cys Phe Asp Ser Thr Val Thr Glu Ser Asp Ile Arg Thr Glu Glu
1405 1410 1415

gca atc tac caa tgt tgt gac ctc gac ccc caa gcc cgc gtg gcc atc 17043
Ala Ile Tyr Gln Cys Cys Asp Leu Asp Pro Gln Ala Arg Val Ala Ile
1420 1425 1430

aag tcc ctc acc gag agg ctt tat gtt ggg ggc cct ctt acc aat tca 17091
Lys Ser Leu Thr Glu Arg Leu Tyr Val Gly Gly Pro Leu Thr Asn Ser
1435 1440 1445

agg ggg gag aac tgc ggc tat cgc agg tgc cgc gcg agc ggc gta ctg 17139
Arg Gly Glu Asn Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Leu
1450 1455 1460 1465

aca act agc tgt ggt aac acc ctc act tgc tac atc aag gcc cgg gca 17187
Thr Thr Ser Cys Gly Asn Thr Leu Thr Cys Tyr Ile Lys Ala Arg Ala
1470 1475 1480

gcc tgt cga gcc gca ggg ctc cag gac tgc acc atg ctc gtg tgt ggc 17235
Ala Cys Arg Ala Ala Gly Leu Gln Asp Cys Thr Met Leu Val Cys Gly
1485 1490 1495

gac gac tta gtc gtt atc tgt gaa agc gcg ggg gtc cag gag gac gcg 17283
Asp Asp Leu Val Val Ile Cys Glu Ser Ala Gly Val Gln Glu Asp Ala
1500 1505 1510

gcg agc ctg aga gcc ttc acg gag gct atg acc agg tac tcc gcc ccc 17331
Ala Ser Leu Arg Ala Phe Thr Glu Ala Met Thr Arg Tyr Ser Ala Pro
1515 1520 1525

cct ggg gac ccc cca caa cca gaa tac gac ttg gag ctc ata aca tca 17379
Pro Gly Asp Pro Pro Gln Pro Glu Tyr Asp Leu Glu Leu Ile Thr Ser
1530 1535 1540 1545

tgc tcc tcc aac gtg tca gtc gcc cac gac ggc gct gga aag agg gtc 17427
Cys Ser Ser Asn Val Ser Val Ala His Asp Gly Ala Gly Lys Arg Val
1550 1555 1560

tac tac ctc acc cgt gac cct aca acc ccc ctc gcg aga gct gcg tgg 17475
Tyr Tyr Leu Thr Arg Asp Pro Thr Thr Pro Leu Ala Arg Ala Ala Trp
1565 1570 1575

gag aca gca aga cac act cca gtc aat tcc tgg cta ggc aac ata atc 17523
Glu Thr Ala Arg His Thr Pro Val Asn Ser Trp Leu Gly Asn Ile Ile
1580 1585 1590

atg ttt gcc ccc aca ctg tgg gcg agg atg ata ctg atg acc cat ttc 17571
Met Phe Ala Pro Thr Leu Trp Ala Arg Met Ile Leu Met Thr His Phe
1595 1600 1605

ttt agc gtc ctt ata gcc agg gac cag ctt gaa cag gcc ctc gat tgc 17619
Phe Ser Val Leu Ile Ala Arg Asp Gln Leu Glu Gln Ala Leu Asp Cys
1610 1615 1620 1625

gag atc tac ggg gcc tgc tac tcc ata gaa cca ctg gat cta cct cca 17667
Glu Ile Tyr Gly Ala Cys Tyr Ser Ile Glu Pro Leu Asp Leu Pro Pro
1630 1635 1640

atc att caa aga ctc cat ggc ctc agc gca ttt tca ctc cac agt tac 17715
Ile Ile Gln Arg Leu His Gly Leu Ser Ala Phe Ser Leu His Ser Tyr
1645 1650 1655

tct cca ggt gaa atc aat agg gtg gcc gca tgc ctc aga aaa ctt ggg 17763
Ser Pro Gly Glu Ile Asn Arg Val Ala Ala Cys Leu Arg Lys Leu Gly
1660 1665 1670

gta ccg ccc ttg cga gct tgg aga cac cgg gcc cgg agc gtc cgc gct 17811
Val Pro Pro Leu Arg Ala Trp Arg His Arg Ala Arg Ser Val Arg Ala
1675 1680 1685

agg ctt ctg gcc aga gga ggc agg gct gcc ata tgt ggc aag tac ctc 17859
Arg Leu Leu Ala Arg Gly Gly Arg Ala Ala Ile Cys Gly Lys Tyr Leu
1690 1695 1700 1705

ttc aac tgg gca gta aga aca aag ctc aaa ctc act cca ata gcg gcc 17907
Phe Asn Trp Ala Val Arg Thr Lys Leu Lys Leu Thr Pro Ile Ala Ala
1710 1715 1720

gct ggc cag ctg gac ttg tcc ggc tgg ttc acg gct ggc tac agc ggg 17955
Ala Gly Gln Leu Asp Leu Ser Gly Trp Phe Thr Ala Gly Tyr Ser Gly
1725 1730 1735

gga gac att tat cac agc gtg tct cat gcc cgg ccc cgc tgg atc tgg 18003
Gly Asp Ile Tyr His Ser Val Ser His Ala Arg Pro Arg Trp Ile Trp
1740 1745 1750

ttt tgc cta ctc ctg ctt gct gca ggg gta ggc atc tac ctc ctc ccc 18051
Phe Cys Leu Leu Leu Leu Ala Ala Gly Val Gly Ile Tyr Leu Leu Pro
1755 1760 1765

aac cga tgaaggttgg ggtaaacact ccggcctaaa aaaaaaaaaa aatctagaac 18107
Asn Arg
1770

ccgagtcgac tttgttccca ctgtactttt agctcgtaca aaatacaata tacttttcat 18167


ttctccgtaa acaacatgtt ttcccatgta atatcctttt ctatttttcg ttccgttacc 18227
aactttacac atactttata tagctattca cttctataca ctaaaaaact aagacaattt 18287
taattttgct gcctgccata tttcaatttg ttataaattc ctataattta tcctattagt 18347
agctaaaaaa agatgaatgt gaatcgaatc ctaagagaat tggatctgat ccacaggacg 18407
ggtgtggtcg ccatgatcgc gtagtcgata gtggctccaa gtagcgaagc gagcaggact 18467
gggcggcggc caaagcggtc ggacagtgct ccgagaacgg gtgcgcatag aaattgcatc 18527
aacgcatata gcgctagcag cacgccatag tgactggcga tgctgtcgga atggacgata 18587
tcccgcaaga ggcccggcag taccggcata accaagccta tgcctacagc atccagggtg 18647
acggtgccga ggatgacgat gagcgcattg ttagatttca tacacggtgc ctgactgcgt 18707
tagcaattta actgtgataa actaccgcat taaagctttt tctttccaat tttttttttt 18767
tcgtcattat aaaaatcatt acgaccgaga ttcccgggta ataactgata taattaaatt 18827
gaagctctaa tttgtgagtt tagtatacat gcatttactt ataatacagt tttttagttt 18887
tgctggccgc atcttctcaa atatgcttcc cagcctgctt ttctgtaacg ttcaccctct 18947
accttagcat cccttccctt tgcaaatagt cctcttccaa caataataat gtcagatcct 19007
gtagagacca catcatccac ggttctatac tgttgaccca atgcgtctcc cttgtcatct 19067
aaacccacac cgggtgtcat aatcaaccaa tcgtaacctt catctcttcc acccatgtct 19127
ctttgagcaa taaagccgat aacaaaatct ttgtcgctct tcgcaatgtc aacagtaccc 19187
ttagtatatt ctccagtaga tagggagccc ttgcatgaca attctgctaa catcaaaagg 19247
cctctaggtt cctttgttac ttcttctgcc gcctgcttca aaccgctaac aatacctggg 19307
cccaccacac cgtgtgcatt cgtaatgtct gcccattctg ctattctgta tacacccgca 19367
gagtactgca atttgactgt attaccaatg tcagcaaatt ttctgtcttc gaagagtaaa 19427
aaattgtact tggcggataa tgcctttagc ggcttaactg tgccctccat ggaaaaatca 19487
gtcaagatat ccacatgtgt ttttagtaaa caaattttgg gacctaatgc ttcaactaac 19547
tccagtaatt ccttggtggt acgaacatcc aatgaagcac acaagtttgt ttgcttttcg 19607
tgcatgatat taaatagctt ggcagcaaca ggactaggat gagtagcagc acgttcctta 19667
tatgtagctt tcgacatgat ttatcttcgt ttcctgcagg tttttgttct gtgcagttgg 19727
gttaagaata ctgggcaatt tcatgtttct tcaacactac atatgcgtat atataccaat 19787
ctaagtctgt gctccttcct tcgttcttcc ttctgttcgg agattaccga atcaaaaaaa 19847
tttcaaggaa accgaaatca aaaaaaagaa taaaaaaaaa atgatgaatt gaaaagctta 19907
tcgat 19912



<210> 9
<211> 1771
<212> PRT
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: pd.deltaNS3NS5

<400> 9
Met Ala Ala Tyr Ala Ala Gln Gly Tyr Lys Val Leu Val Leu Asn Pro
1 5 10 15

Ser Val Ala Ala Thr Leu Gly Phe Gly Ala Tyr Met Ser Lys Ala His
20 25 30

Gly Ile Asp Pro Asn Ile Arg Thr Gly Val Arg Thr Ile Thr Thr Gly
35 40 45

Ser Pro Ile Thr Tyr Ser Thr Tyr Gly Lys Phe Leu Ala Asp Gly Gly
50 55 60

Cys Ser Gly Gly Ala Tyr Asp Ile Ile Ile Cys Asp Glu Cys His Ser
65 70 75 80

Thr Asp Ala Thr Ser Ile Leu Gly Ile Gly Thr Val Leu Asp Gln Ala
85 90 95

Glu Thr Ala Gly Ala Arg Leu Val Val Leu Ala Thr Ala Thr Pro Pro
100 105 110

Gly Ser Val Thr Val Pro His Pro Asn Ile Glu Glu Val Ala Leu Ser
115 120 125

Thr Thr Gly Glu Ile Pro Phe Tyr Gly Lys Ala Ile Pro Leu Glu Val
130 135 140

Ile Lys Gly Gly Arg His Leu Ile Phe Cys His Ser Lys Lys Lys Cys
145 150 155 160

Asp Glu Leu Ala Ala Lys Leu Val Ala Leu Gly Ile Asn Ala Val Ala
165 170 175

Tyr Tyr Arg Gly Leu Asp Val Ser Val Ile Pro Thr Ser Gly Asp Val
180 185 190

Val Val Val Ala Thr Asp Ala Leu Met Thr Gly Tyr Thr Gly Asp Phe
195 200 205

Asp Ser Val Ile Asp Cys Asn Thr Cys Val Thr Gln Thr Val Asp Phe
210 215 220

Ser Leu Asp Pro Thr Phe Thr Ile Glu Thr Ile Thr Leu Pro Gln Asp
225 230 235 240

Ala Val Ser Arg Thr Gln Arg Arg Gly Arg Thr Gly Arg Gly Lys Pro
245 250 255
Gly Ile Tyr Arg Phe Val Ala Pro Gly Glu Arg Pro Ser Gly Met Phe
260 265 270

Asp Ser Ser Val Leu Cys Glu Cys Tyr Asp Ala Gly Cys Ala Trp Tyr
275 280 285

Glu Leu Thr Pro Ala Glu Thr Thr Val Arg Leu Arg Ala Tyr Met Asn
290 295 300

Thr Pro Gly Leu Pro Val Cys Gln Asp His Leu Glu Phe Trp Glu Gly
305 310 315 320

Val Phe Thr Gly Leu Thr His Ile Asp Ala His Phe Leu Ser Gln Thr
325 330 335

Lys Gln Ser Gly Glu Asn Leu Pro Tyr Leu Val Ala Tyr Gln Ala Thr
340 345 350

Val Cys Ala Arg Ala Gln Ala Pro Pro Pro Ser Trp Asp Gln Met Trp
355 360 365

Lys Cys Leu Ile Arg Leu Lys Pro Thr Leu His Gly Pro Thr Pro Leu
370 375 380

Leu Tyr Arg Leu Gly Ala Val Gln Asn Glu Ile Thr Leu Thr His Pro
385 390 395 400

Val Thr Lys Tyr Ile Met Thr Cys Met Ser Ala Asp Leu Glu Val Val
405 410 415

Thr Ser Thr Trp Val Leu Val Gly Gly Val Leu Ala Ala Leu Ala Ala
420 425 430

Tyr Cys Leu Ser Thr Gly Cys Val Val Ile Val Gly Arg Val Val Leu
435 440 445

Ser Gly Lys Pro Ala Ile Ile Pro Asp Arg Glu Val Leu Tyr Arg Glu
450 455 460

Phe Asp Glu Met Glu Glu Cys Ser Gln His Leu Pro Tyr Ile Glu Gln
465 470 475 480

Gly Met Met Leu Ala Glu Gln Phe Lys Gln Lys Ala Leu Gly Leu Leu
485 490 495

Gln Thr Ala Ser Arg Gln Ala Glu Val Ile Ala Pro Ala Val Gln Thr
500 505 510

Asn Trp Gln Lys Leu Glu Thr Phe Trp Ala Lys His Met Trp Asn Phe
515 520 525

Ile Ser Gly Ile Gln Tyr Leu Ala Gly Leu Ser Thr Leu Pro Gly Asn
530 535 540

Pro Ala Ile Ala Ser Leu Met Ala Phe Thr Ala Ala Val Thr Ser Pro
545 550 555 560

Leu Thr Thr Ser Gln Thr Leu Leu Phe Asn Ile Leu Gly Gly Trp Val
565 570 575
Ala Ala Gln Leu Ala Ala Pro Gly Ala Ala Thr Ala Phe Val Gly Ala
580 585 590

Gly Leu Ala Gly Ala Ala Ile Gly Ser Val Gly Leu Gly Lys Val Leu
595 600 605

Ile Asp Ile Leu Ala Gly Tyr Gly Ala Gly Val Ala Gly Ala Leu Val
610 615 620

Ala Phe Lys Ile Met Ser Gly Glu Val Pro Ser Thr Glu Asp Leu Val
625 630 635 640

Asn Leu Leu Pro Ala Ile Leu Ser Pro Gly Ala Leu Val Val Gly Val
645 650 655

Val Cys Ala Ala Ile Leu Arg Arg His Val Gly Pro Gly Glu Gly Ala
660 665 670

Val Gln Trp Met Asn Arg Leu Ile Ala Phe Ala Ser Arg Gly Asn His
675 680 685

Val Ser Pro Thr His Tyr Val Pro Glu Ser Asp Ala Ala Ala Arg Val
690 695 700

Thr Ala Ile Leu Ser Ser Leu Thr Val Thr Gln Leu Leu Arg Arg Leu
705 710 715 720

His Gln Trp Ile Ser Ser Glu Cys Thr Thr Pro Cys Ser Gly Ser Trp
725 730 735

Leu Arg Asp Ile Trp Asp Trp Ile Cys Glu Val Leu Ser Asp Phe Lys
740 745 750

Thr Trp Leu Lys Ala Lys Leu Met Pro Gln Leu Pro Gly Ile Pro Phe
755 760 765

Val Ser Cys Gln Arg Gly Tyr Lys Gly Val Trp Arg Gly Asp Gly Ile
770 775 780

Met His Thr Arg Cys His Cys Gly Ala Glu Ile Thr Gly His Val Lys
785 790 795 800

Asn Gly Thr Met Arg Ile Val Gly Pro Arg Thr Cys Arg Asn Met Trp
805 810 815

Ser Gly Thr Phe Pro Ile Asn Ala Tyr Thr Thr Gly Pro Cys Thr Pro
820 825 830

Leu Pro Ala Pro Asn Tyr Thr Phe Ala Leu Trp Arg Val Ser Ala Glu
835 840 845

Glu Tyr Val Glu Ile Arg Gln Val Gly Asp Phe His Tyr Val Thr Gly
850 855 860

Met Thr Thr Asp Asn Leu Lys Cys Pro Cys Gln Val Pro Ser Pro Glu
865 870 875 880

Phe Phe Thr Glu Leu Asp Gly Val Arg Leu His Arg Phe Ala Pro Pro
885 890 895
Cys Lys Pro Leu Leu Arg Glu Glu Val Ser Phe Arg Val Gly Leu His
900 905 910

Glu Tyr Pro Val Gly Ser Gln Leu Pro Cys Glu Pro Glu Pro Asp Val
915 920 925

Ala Val Leu Thr Ser Met Leu Thr Asp Pro Ser His Ile Thr Ala Glu
930 935 940

Ala Ala Gly Arg Arg Leu Ala Arg Gly Ser Pro Pro Ser Val Ala Ser
945 950 955 960

Ser Ser Ala Ser Gln Leu Ser Ala Pro Ser Leu Lys Ala Thr Cys Thr
965 970 975

Ala Asn His Asp Ser Pro Asp Ala Glu Leu Ile Glu Ala Asn Leu Leu
980 985 990

Trp Arg Gln Glu Met Gly Gly Asn Ile Thr Arg Val Glu Ser Glu Asn
995 1000 1005

Lys Val Val Ile Leu Asp Ser Phe Asp Pro Leu Val Ala Glu Glu Asp
1010 1015 1020

Glu Arg Glu Ile Ser Val Pro Ala Glu Ile Leu Arg Lys Ser Arg Arg
1025 1030 1035 1040

Phe Ala Gln Ala Leu Pro Val Trp Ala Arg Pro Asp Tyr Asn Pro Pro
1045 1050 1055

Leu Val Glu Thr Trp Lys Lys Pro Asp Tyr Glu Pro Pro Val Val His
1060 1065 1070

Gly Cys Pro Leu Pro Pro Pro Lys Ser Pro Pro Val Pro Pro Pro Arg
1075 1080 1085

Lys Lys Arg Thr Val Val Leu Thr Glu Ser Thr Leu Ser Thr Ala Leu
1090 1095 1100

Ala Glu Leu Ala Thr Arg Ser Phe Gly Ser Ser Ser Thr Ser Gly Ile
1105 1110 1115 1120

Thr Gly Asp Asn Thr Thr Thr Ser Ser Glu Pro Ala Pro Ser Gly Cys
1125 1130 1135

Pro Pro Asp Ser Asp Ala Glu Ser Tyr Ser Ser Met Pro Pro Leu Glu
1140 1145 1150

Gly Glu Pro Gly Asp Pro Asp Leu Ser Asp Gly Ser Trp Ser Thr Val
1155 1160 1165

Ser Ser Glu Ala Asn Ala Glu Asp Val Val Cys Cys Ser Met Ser Tyr
1170 1175 1180

Ser Trp Thr Gly Ala Leu Val Thr Pro Cys Ala Ala Glu Glu Gln Lys
1185 1190 1195 1200

Leu Pro Ile Asn Ala Leu Ser Asn Ser Leu Leu Arg His His Asn Leu
1205 1210 1215
Val Tyr Ser Thr Thr Ser Arg Ser Ala Cys Gln Arg Gln Lys Lys Val
1220 1225 1230

Thr Phe Asp Arg Leu Gln Val Leu Asp Ser His Tyr Gln Asp Val Leu
1235 1240 1245

Lys Glu Val Lys Ala Ala Ala Ser Lys Val Lys Ala Asn Leu Leu Ser
1250 1255 1260

Val Glu Glu Ala Cys Ser Leu Thr Pro Pro His Ser Ala Lys Ser Lys
1265 1270 1275 1280

Phe Gly Tyr Gly Ala Lys Asp Val Arg Cys His Ala Arg Lys Ala Val
1285 1290 1295

Thr His Ile Asn Ser Val Trp Lys Asp Leu Leu Glu Asp Asn Val Thr
1300 1305 1310

Pro Ile Asp Thr Thr Ile Met Ala Lys Asn Glu Val Phe Cys Val Gln
1315 1320 1325

Pro Glu Lys Gly Gly Arg Lys Pro Ala Arg Leu Ile Val Phe Pro Asp
1330 1335 1340

Leu Gly Val Arg Val Cys Glu Lys Met Ala Leu Tyr Asp Val Val Thr
1345 1350 1355 1360

Lys Leu Pro Leu Ala Val Met Gly Ser Ser Tyr Gly Phe Gln Tyr Ser
1365 1370 1375

Pro Gly Gln Arg Val Glu Phe Leu Val Gln Ala Trp Lys Ser Lys Lys
1380 1385 1390

Thr Pro Met Gly Phe Ser Tyr Asp Thr Arg Cys Phe Asp Ser Thr Val
1395 1400 1405

Thr Glu Ser Asp Ile Arg Thr Glu Glu Ala Ile Tyr Gln Cys Cys Asp
1410 1415 1420

Leu Asp Pro Gln Ala Arg Val Ala Ile Lys Ser Leu Thr Glu Arg Leu
1425 1430 1435 1440

Tyr Val Gly Gly Pro Leu Thr Asn Ser Arg Gly Glu Asn Cys Gly Tyr
1445 1450 1455

Arg Arg Cys Arg Ala Ser Gly Val Leu Thr Thr Ser Cys Gly Asn Thr
1460 1465 1470

Leu Thr Cys Tyr Ile Lys Ala Arg Ala Ala Cys Arg Ala Ala Gly Leu
1475 1480 1485

Gln Asp Cys Thr Met Leu Val Cys Gly Asp Asp Leu Val Val Ile Cys
1490 1495 1500

Glu Ser Ala Gly Val Gln Glu Asp Ala Ala Ser Leu Arg Ala Phe Thr
1505 1510 1515 1520

Glu Ala Met Thr Arg Tyr Ser Ala Pro Pro Gly Asp Pro Pro Gln Pro
1525 1530 1535
Glu Tyr Asp Leu Glu Leu Ile Thr Ser Cys Ser Ser Asn Val Ser Val
1540 1545 1550

Ala His Asp Gly Ala Gly Lys Arg Val Tyr Tyr Leu Thr Arg Asp Pro
1555 1560 1565

Thr Thr Pro Leu Ala Arg Ala Ala Trp Glu Thr Ala Arg His Thr Pro
1570 1575 1580

Val Asn Ser Trp Leu Gly Asn Ile Ile Met Phe Ala Pro Thr Leu Trp
1585 1590 1595 1600

Ala Arg Met Ile Leu Met Thr His Phe Phe Ser Val Leu Ile Ala Arg
1605 1610 1615

Asp Gln Leu Glu Gln Ala Leu Asp Cys Glu Ile Tyr Gly Ala Cys Tyr
1620 1625 1630

Ser Ile Glu Pro Leu Asp Leu Pro Pro Ile Ile Gln Arg Leu His Gly
1635 1640 1645

Leu Ser Ala Phe Ser Leu His Ser Tyr Ser Pro Gly Glu Ile Asn Arg
1650 1655 1660

Val Ala Ala Cys Leu Arg Lys Leu Gly Val Pro Pro Leu Arg Ala Trp
1665 1670 1675 1680

Arg His Arg Ala Arg Ser Val Arg Ala Arg Leu Leu Ala Arg Gly Gly
1685 1690 1695

Arg Ala Ala Ile Cys Gly Lys Tyr Leu Phe Asn Trp Ala Val Arg Thr
1700 1705 1710

Lys Leu Lys Leu Thr Pro Ile Ala Ala Ala Gly Gln Leu Asp Leu Ser
1715 1720 1725

Gly Trp Phe Thr Ala Gly Tyr Ser Gly Gly Asp Ile Tyr His Ser Val
1730 1735 1740

Ser His Ala Arg Pro Arg Trp Ile Trp Phe Cys Leu Leu Leu Leu Ala
1745 1750 1755 1760

Ala Gly Val Gly Ile Tyr Leu Leu Pro Asn Arg
1765 1770



<210> 10
<211> 19798
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: pd.deltaNS3NS5.pj

<220>
<221> CDS
<222> (12679)..(17991)

<400> 10
atcgatccta ccccttgcgc taaagaagta tatgtgccta ctaacgcttg tctttgtctc 60
tgtcactaaa cactggatta ttactcccag atacttattt tggactaatt taaatgattt 120
cggatcaacg ttcttaatat cgctgaatct tccacaattg atgaaagtag ctaggaagag 180
gaattggtat aaagtttttg tttttgtaaa tctcgaagta tactcaaacg aatttagtat 240
tttctcagtg atctcccaga tgctttcacc ctcacttaga agtgctttaa gcattttttt 300
actgtggcta tttcccttat ctgcttcttc cgatgattcg aactgtaatt gcaaactact 360
tacaatatca gtgatatcag attgatgttt ttgtccatag taaggaataa ttgtaaattc 420
ccaagcagga atcaatttct ttaatgaggc ttccagaatt gttgcttttt gcgtcttgta 480
tttaaactgg agtgatttat tgacaatatc gaaactcagc gaattgctta tgatagtatt 540
atagctcatg aatgtggctc tcttgattgc tgttccgtta tgtgtaatca tccaacataa 600
ataggttagt tcagcagcac ataatgctat tttctcacct gaaggtcttt caaacctttc 660
cacaaactga cgaacaagca ccttaggtgg tgttttacat aatatatcaa attgtggcat 720
gcttagcgcc gatcttgtgt gcaattgata tctagtttca actactctat ttatcttgta 780
tcttgcagta ttcaaacacg ctaactcgaa aaactaactt taattgtcct gtttgtctcg 840
cgttctttcg aaaaatgcac cggccgcgca ttatttgtac tgcgaaaata attggtactg 900
cggtatcttc atttcatatt ttaaaaatgc acctttgctg cttttcctta atttttagac 960
ggcccgcagg ttcgttttgc ggtactatct tgtgataaaa agttgttttg acatgtgatc 1020
tgcacagatt ttataatgta ataagcaaga atacattatc aaacgaacaa tactggtaaa 1080
agaaaaccaa aatggacgac attgaaacag ccaagaatct gacggtaaaa gcacgtacag 1140
cttatagcgt ctgggatgta tgtcggctgt ttattgaaat gattgctcct gatgtagata 1200
ttgatataga gagtaaacgt aagtctgatg agctactctt tccaggatat gtcataaggc 1260
ccatggaatc tctcacaacc ggtaggccgt atggtcttga ttctagcgca gaagattcca 1320
gcgtatcttc tgactccagt gctgaggtaa ttttgcctgc tgcgaagatg gttaaggaaa 1380
ggtttgattc gattggaaat ggtatgctct cttcacaaga agcaagtcag gctgccatag 1440
atttgatgct acagaataac aagctgttag acaatagaaa gcaactatac aaatctattg 1500
ctataataat aggaagattg cccgagaaag acaagaagag agctaccgaa atgctcatga 1560
gaaaaatgga ttgtacacag ttattagtcc caccagctcc aacggaagaa gatgttatga 1620
agctcgtaag cgtcgttacc caattgctta ctttagttcc accagatcgt caagctgctt 1680
taataggtga tttattcatc ccggaatctc taaaggatat attcaatagt ttcaatgaac 1740
tggcggcaga gaatcgttta cagcaaaaaa agagtgagtt ggaaggaagg actgaagtga 1800
accatgctaa tacaaatgaa gaagttccct ccaggcgaac aagaagtaga gacacaaatg 1860
caagaggagc atataaatta caaaacacca tcactgaggg ccctaaagcg gttcccacga 1920
aaaaaaggag agtagcaacg agggtaaggg gcagaaaatc acgtaatact tctagggtat 1980
gatccaatat caaaggaaat gatagcattg aaggatgaga ctaatccaat tgaggagtgg 2040
cagcatatag aacagctaaa gggtagtgct gaaggaagca tacgataccc cgcatggaat 2100
gggataatat cacaggaggt actagactac ctttcatcct acataaatag acgcatataa 2160
gtacgcattt aagcataaac acgcactatg ccgttcttct catgtatata tatatacagg 2220
caacacgcag atataggtgc gacgtgaaca gtgagctgta tgtgcgcagc tcgcgttgca 2280
ttttcggaag cgctcgtttt cggaaacgct ttgaagttcc tattccgaag ttcctattct 2340
ctagaaagta taggaacttc agagcgcttt tgaaaaccaa aagcgctctg aagacgcact 2400
ttcaaaaaac caaaaacgca ccggactgta acgagctact aaaatattgc gaataccgct 2460
tccacaaaca ttgctcaaaa gtatctcttt gctatatatc tctgtgctat atccctatat 2520
aacctaccca tccacctttc gctccttgaa cttgcatcta aactcgacct ctacatcaac 2580
aggcttccaa tgctcttcaa attttactgt caagtagacc catacggctg taatatgctg 2640
ctcttcataa tgtaagctta tctttatcga atcgtgtgaa aaactactac cgcgataaac 2700
ctttacggtt ccctgagatt gaattagttc ctttagtata tgatacaaga cacttttgaa 2760
ctttgtacga cgaattttga ggttcgccat cctctggcta tttccaatta tcctgtcggc 2820
tattatctcc gcctcagttt gatcttccgc ttcagactgc catttttcac ataatgaatc 2880
tatttcaccc cacaatcctt catccgcctc cgcatcttgt tccgttaaac tattgacttc 2940
atgttgtaca ttgtttagtt cacgagaagg gtcctcttca ggcggtagct cctgatctcc 3000
tatatgacct ttatcctgtt ctctttccac aaacttagaa atgtattcat gaattatgga 3060
gcacctaata acattcttca aggcggagaa gtttgggcca gatgcccaat atgcttgaca 3120
tgaaaacgtg agaatgaatt tagtattatt gtgatattct gaggcaattt tattataatc 3180
tcgaagataa gagaagaatg cagtgacctt tgtattgaca aatggagatt ccatgtatct 3240
aaaaaatacg cctttaggcc ttctgatacc ctttcccctg cggtttagcg tgccttttac 3300
attaatatct aaaccctctc cgatggtggc ctttaactga ctaataaatg caaccgatat 3360
aaactgtgat aattctgggt gatttatgat tcgatcgaca attgtattgt acactagtgc 3420
aggatcaggc caatccagtt ctttttcaat taccggtgtg tcgtctgtat tcagtacatg 3480
tccaacaaat gcaaatgcta acgttttgta tttcttataa ttgtcaggaa ctggaaaagt 3540
cccccttgtc gtctcgatta cacacctact ttcatcgtac accataggtt ggaagtgctg 3600
cataatacat tgcttaatac aagcaagcag tctctcgcca ttcatatttc agttattttc 3660
cattacagct gatgtcattg tatatcagcg ctgtaaaaat ctatctgtta cagaaggttt 3720
tcgcggtttt tataaacaaa actttcgtta cgaaatcgag caatcacccc agctgcgtat 3780
ttggaaattc gggaaaaagt agagcaacgc gagttgcatt ttttacacca taatgcatga 3840
ttaacttcga gaagggatta aggctaattt cactagtatg tttcaaaaac ctcaatctgt 3900
ccattgaatg ccttataaaa cagctataga ttgcatagaa gagttagcta ctcaatgctt 3960
tttgtcaaag cttactgatg atgatgtgtc tactttcagg cgggtctgta gtaaggagaa 4020
tgacattata aagctggcac ttagaattcc acggactata gactatacta gtatactccg 4080
tctactgtac gatacacttc cgctcaggtc cttgtccttt aacgaggcct taccactctt 4140
ttgttactct attgatccag ctcagcaaag gcagtgtgat ctaagattct atcttcgcga 4200
tgtagtaaaa ctagctagac cgagaaagag actagaaatg caaaaggcac ttctacaatg 4260
gctgccatca ttattatccg atgtgacgct gcattttttt tttttttttt tttttttttt 4320
tttttttttt tttttttttt ttttttggta caaatatcat aaaaaaagag aatcttttta 4380
agcaaggatt ttcttaactt cttcggcgac agcatcaccg acttcggtgg tactgttgga 4440
accacctaaa tcaccagttc tgatacctgc atccaaaacc tttttaactg catcttcaat 4500
ggctttacct tcttcaggca agttcaatga caatttcaac atcattgcag cagacaagat 4560
agtggcgata gggttgacct tattctttgg caaatctgga gcggaaccat ggcatggttc 4620
gtacaaacca aatgcggtgt tcttgtctgg caaagaggcc aaggacgcag atggcaacaa 4680
acccaaggag cctgggataa cggaggcttc atcggagatg atatcaccaa acatgttgct 4740
ggtgattata ataccattta ggtgggttgg gttcttaact aggatcatgg cggcagaatc 4800
aatcaattga tgttgaactt tcaatgtagg gaattcgttc ttgatggttt cctccacagt 4860
ttttctccat aatcttgaag aggccaaaac attagcttta tccaaggacc aaataggcaa 4920
tggtggctca tgttgtaggg ccatgaaagc ggccattctt gtgattcttt gcacttctgg 4980
aacggtgtat tgttcactat cccaagcgac accatcacca tcgtcttcct ttctcttacc 5040
aaagtaaata cctcccacta attctctaac aacaacgaag tcagtacctt tagcaaattg 5100
tggcttgatt ggagataagt ctaaaagaga gtcggatgca aagttacatg gtcttaagtt 5160
ggcgtacaat tgaagttctt tacggatttt tagtaaacct tgttcaggtc taacactacc 5220
ggtaccccat ttaggaccac ccacagcacc taacaaaacg gcatcagcct tcttggaggc 5280
ttccagcgcc tcatctggaa gtggaacacc tgtagcatcg atagcagcac caccaattaa 5340
atgattttcg aaatcgaact tgacattgga acgaacatca gaaatagctt taagaacctt 5400
aatggcttcg gctgtgattt cttgaccaac gtggtcacct ggcaaaacga cgatcttctt 5460
aggggcagac attacaatgg tatatccttg aaatatatat aaaaaaaaaa aaaaaaaaaa 5520
aaaaaaaaaa atgcagcttc tcaatgatat tcgaatacgc tttgaggaga tacagcctaa 5580
tatccgacaa actgttttac agatttacga tcgtacttgt tacccatcat tgaattttga 5640
acatccgaac ctgggagttt tccctgaaac agatagtata tttgaacctg tataataata 5700
tatagtctag cgctttacgg aagacaatgt atgtatttcg gttcctggag aaactattgc 5760
atctattgca taggtaatct tgcacgtcgc atccccggtt cattttctgc gtttccatct 5820
tgcacttcaa tagcatatct ttgttaacga agcatctgtg cttcattttg tagaacaaaa 5880
atgcaacgcg agagcgctaa tttttcaaac aaagaatctg agctgcattt ttacagaaca 5940
gaaatgcaac gcgaaagcgc tattttacca acgaagaatc tgtgcttcat ttttgtaaaa 6000
caaaaatgca acgcgagagc gctaattttt caaacaaaga atctgagctg catttttaca 6060
gaacagaaat gcaacgcgag agcgctattt taccaacaaa gaatctatac ttcttttttg 6120
ttctacaaaa atgcatcccg agagcgctat ttttctaaca aagcatctta gattactttt 6180
tttctccttt gtgcgctcta taatgcagtc tcttgataac tttttgcact gtaggtccgt 6240
taaggttaga agaaggctac tttggtgtct attttctctt ccataaaaaa agcctgactc 6300
cacttcccgc gtttactgat tactagcgaa gctgcgggtg cattttttca agataaaggc 6360
atccccgatt atattctata ccgatgtgga ttgcgcatac tttgtgaaca gaaagtgata 6420
gcgttgatga ttcttcattg gtcagaaaat tatgaacggt ttcttctatt ttgtctctat 6480
atactacgta taggaaatgt ttacattttc gtattgtttt cgattcactc tatgaatagt 6540
tcttactaca atttttttgt ctaaagagta atactagaga taaacataaa aaatgtagag 6600
gtcgagttta gatgcaagtt caaggagcga aaggtggatg ggtaggttat atagggatat 6660
agcacagaga tatatagcaa agagatactt ttgagcaatg tttgtggaag cggtattcgc 6720
aatattttag tagctcgtta cagtccggtg cgtttttggt tttttgaaag tgcgtcttca 6780
gagcgctttt ggttttcaaa agcgctctga agttcctata ctttctagag aataggaact 6840
tcggaatagg aacttcaaag cgtttccgaa aacgagcgct tccgaaaatg caacgcgagc 6900
tgcgcacata cagctcactg ttcacgtcgc acctatatct gcgtgttgcc tgtatatata 6960
tatacatgag aagaacggca tagtgcgtgt ttatgcttaa atgcgtactt atatgcgtct 7020
atttatgtag gatgaaaggt agtctagtac ctcctgtgat attatcccat tccatgcggg 7080
gtatcgtatg cttccttcag cactaccctt tagctgttct atatgctgcc actcctcaat 7140
tggattagtc tcatccttca acgctatcat ttcctttgat attggatcat atgcatagta 7200
ccgagaaact agtgcgaagt agtgatcagg tattgctgtt atctgatgag tatacgttgt 7260
cctggccacg gcagaagcac gcttatcgct ccaatttccc acaacattag tcaactccgt 7320
taggcccttc attgaaagaa atgaggtcat caaatgtctt ccaatgtgag attttgggcc 7380
attttttata gcaaagattg aataaggcgc atttttcttc aaagctttat tgtacgatct 7440
gactaagtta tcttttaata attggtattc ctgtttattg cttgaagaat tgccggtcct 7500
atttactcgt tttaggactg gttcagaatt cctcaaaaat tcatccaaat atacaagtgg 7560
atcgatgata agctgtcaaa catgagaatt cttgaagacg aaagggcctc gtgatacgcc 7620
tatttttata ggttaatgtc atgataataa tggtttctta gacgtcaggt ggcacttttc 7680
ggggaaatgt gcgcggaacc cctatttgtt tatttttcta aatacattca aatatgtatc 7740
cgctcatgag acaataaccc tgataaatgc ttcaataata ttgaaaaagg aagagtatga 7800
gtattcaaca tttccgtgtc gcccttattc ccttttttgc ggcattttgc cttcctgttt 7860
ttgctcaccc agaaacgctg gtgaaagtaa aagatgctga agatcagttg ggtgcacgag 7920
tgggttacat cgaactggat ctcaacagcg gtaagatcct tgagagtttt cgccccgaag 7980
aacgttttcc aatgatgagc acttttaaag ttctgctatg tggcgcggta ttatcccgtg 8040
ttgacgccgg gcaagagcaa ctcggtcgcc gcatacacta ttctcagaat gacttggttg 8100
agtactcacc agtcacagaa aagcatctta cggatggcat gacagtaaga gaattatgca 8160
gtgctgccat aaccatgagt gataacactg cggccaactt acttctgaca acgatcggag 8220
gaccgaagga gctaaccgct tttttgcaca acatggggga tcatgtaact cgccttgatc 8280
gttgggaacc ggagctgaat gaagccatac caaacgacga gcgtgacacc acgatgcctg 8340
cagcaatggc aacaacgttg cgcaaactat taactggcga actacttact ctagcttccc 8400
ggcaacaatt aatagactgg atggaggcgg ataaagttgc aggaccactt ctgcgctcgg 8460
cccttccggc tggctggttt attgctgata aatctggagc cggtgagcgt gggtctcgcg 8520
gtatcattgc agcactgggg ccagatggta agccctcccg tatcgtagtt atctacacga 8580
cggggagtca ggcaactatg gatgaacgaa atagacagat cgctgagata ggtgcctcac 8640
tgattaagca ttggtaactg tcagaccaag tttactcata tatactttag attgatttaa 8700
aacttcattt ttaatttaaa aggatctagg tgaagatcct ttttgataat ctcatgacca 8760
aaatccctta acgtgagttt tcgttccact gagcgtcaga ccccgtagaa aagatcaaag 8820
gatcttcttg agatcctttt tttctgcgcg taatctgctg cttgcaaaca aaaaaaccac 8880
cgctaccagc ggtggtttgt ttgccggatc aagagctacc aactcttttt ccgaaggtaa 8940






ctggcttcag cagagcgcag ataccaaata ctgtccttct agtgtagccg tagttaggcc 9000
accacttcaa gaactctgta gcaccgccta catacctcgc tctgctaatc ctgttaccag 9060
tggctgctgc cagtggcgat aagtcgtgtc ttaccgggtt ggactcaaga cgatagttac 9120
cggataaggc gcagcggtcg ggctgaacgg ggggttcgtg cacacagccc agcttggagc 9180
gaacgaccta caccgaactg agatacctac agcgtgagct atgagaaagc gccacgcttc 9240
ccgaagggag aaaggcggac aggtatccgg taagcggcag ggtcggaaca ggagagcgca 9300
cgagggagct tccaggggga aacgcctggt atctttatag tcctgtcggg tttcgccacc 9360
tctgacttga gcgtcgattt ttgtgatgct cgtcaggggg gcggagccta tggaaaaacg 9420
ccagcaacgc ggccttttta cggttcctgg ccttttgctg gccttttgct cacatgttct 9480
ttcctgcgtt atcccctgat tctgtggata accgtattac cgcctttgag tgagctgata 9540
ccgctcgccg cagccgaacg accgagcgca gcgagtcagt gagcgaggaa gcggaagagc 9600
gcctgatgcg gtattttctc cttacgcatc tgtgcggtat ttcacaccgc atatggtgca 9660
ctctcagtac aatctgctct gatgccgcat agttaagcca gtatacactc cgctatcgct 9720
acgtgactgg gtcatggctg cgccccgaca cccgccaaca cccgctgacg cgccctgacg 9780
ggcttgtctg ctcccggcat ccgcttacag acaagctgtg accgtctccg ggagctgcat 9840
gtgtcagagg ttttcaccgt catcaccgaa acgcgcgagg cagctgcggt aaagctcatc 9900
agcgtggtcg tgaagcgatt cacagatgtc tgcctgttca tccgcgtcca gctcgttgag 9960
tttctccaga agcgttaatg tctggcttct gataaagcgg gccatgttaa gggcggtttt 10020
ttcctgtttg gtcactgatg cctccgtgta agggggattt ctgttcatgg gggtaatgat 10080
accgatgaaa cgagagagga tgctcacgat acgggttact gatgatgaac atgcccggtt 10140
actggaacgt tgtgagggta aacaactggc ggtatggatg cggcgggacc agagaaaaat 10200
cactcagggt caatgccagc gcttcgttaa tacagatgta ggtgttccac agggtagcca 10260
gcagcatcct gcgatgcaga tccggaacat aatggtgcag ggcgctgact tccgcgtttc 10320
cagactttac gaaacacgga aaccgaagac cattcatgtt gttgctcagg tcgcagacgt 10380
tttgcagcag cagtcgcttc acgttcgctc gcgtatcggt gattcattct gctaaccagt 10440
aaggcaaccc cgccagccta gccgggtcct caacgacagg agcacgatca tgcgcacccg 10500
tggccaggac ccaacgctgc ccgagatgcg ccgcgtgcgg ctgctggaga tggcggacgc 10560
gatggatatg ttctgccaag ggttggtttg cgcattcaca gttctccgca agaattgatt 10620
ggctccaatt cttggagtgg tgaatccgtt agcgaggtgc cgccggcttc cattcaggtc 10680
gaggtggccc ggctccatgc accgcgacgc aacgcgggga ggcagacaag gtatagggcg 10740
gcgcctacaa tccatgccaa cccgttccat gtgctcgccg aggcggcata aatcgccgtg 10800
acgatcagcg gtccaatgat cgaagttagg ctggtaagag ccgcgagcga tccttgaagc 10860
tgtccctgat ggtcgtcatc tacctgcctg gacagcatgg cctgcaacgc gggcatcccg 10920
atgccgccgg aagcgagaag aatcataatg gggaaggcca tccagcctcg cgtcgcgaac 10980
gccagcaaga cgtagcccag cgcgtcggcc gccatgccgg cgataatggc ctgcttctcg 11040
ccgaaacgtt tggtggcggg accagtgacg aaggcttgag cgagggcgtg caagattccg 11100
aataccgcaa gcgacaggcc gatcatcgtc gcgctccagc gaaagcggtc ctcgccgaaa 11160
atgacccaga gcgctgccgg cacctgtcct acgagttgca tgataaagaa gacagtcata 11220
agtgcggcga cgatagtcat gccccgcgcc caccggaagg agctgactgg gttgaaggct 11280
ctcaagggca tcggtcgagg atccttcaat atgcgcacat acgctgttat gttcaaggtc 11340
ccttcgttta agaacgaaag cggtcttcct tttgagggat gtttcaagtt gttcaaatct 11400
atcaaatttg caaatcccca gtctgtatct agagcgttga atcggtgatg cgatttgtta 11460
attaaattga tggtgtcacc attaccaggt ctagatatac caatggcaaa ctgagcacaa 11520
caataccagt ccggatcaac tggcaccatc tctcccgtag tctcatctaa tttttcttcc 11580
ggatgaggtt ccagatatac cgcaacacct ttattatggt ttccctgagg gaataataga 11640
atgtcccatt cgaaatcacc aattctaaac ctgggcgaat tgtatttcgg gtttgttaac 11700
tcgttccagt caggaatgtt ccacgtgaag ctatcttcca gcaaagtctc cacttcttca 11760
tcaaattgtg gagaatactc ccaatgctct tatctatggg acttccggga aacacagtac 11820
cgatacttcc caattcgtct tcagagctca ttgtttgttt gaagagacta atcaaagaat 11880
cgttttctca aaaaaattaa tatcttaact gatagtttga tcaaaggggc aaaacgtagg 11940
ggcaaacaaa cggaaaaatc gtttctcaaa ttttctgatg ccaagaactc taaccagtct 12000
tatctaaaaa ttgccttatg atccgtctct ccggttacag cctgtgtaac tgattaatcc 12060
tgcctttcta atcaccattc taatgtttta attaagggat tttgtcttca ttaacggctt 12120
tcgctcataa aaatgttatg acgttttgcc cgcaggcggg aaaccatcca cttcacgaga 12180
ctgatctcct ctgccggaac accgggcatc tccaacttat aagttggaga aataagagaa 12240
tttcagattg agagaatgaa aaaaaaaaac ccttagttca taggtccatt ctcttagcgc 12300
aactacagag aacaggggca caaacaggca aaaaacgggc acaacctcaa tggagtgatg 12360
caacctgcct ggagtaaatg atgacacaag gcaattgacc cacgcatgta tctatctcat 12420
tttcttacac cttctattac cttctgctct ctctgatttg gaaaaagctg aaaaaaaagg 12480
ttgaaaccag ttccctgaaa ttattcccct acttgactaa taagtatata aagacggtag 12540
gtattgattg taattctgta aatctatttc ttaaacttct taaattctac ttttatagtt 12600

agtctttttt ttagttttaa aacaccaaga acttagtttc gaataaacac acataaacaa 12660

acaagcttac aaaacaaa atg gct gca tat gca gct cag ggc tat aag gtg 12711
Met Ala Ala Tyr Ala Ala Gln Gly Tyr Lys Val
1 5 10

cta gta ctc aac ccc tct gtt gct gca aca ctg ggc ttt ggt gct tac 12759
Leu Val Leu Asn Pro Ser Val Ala Ala Thr Leu Gly Phe Gly Ala Tyr
15 20 25

atg tcc aag gct cat ggg atc gat cct aac atc agg acc ggg gtg aga 12807
Met Ser Lys Ala His Gly Ile Asp Pro Asn Ile Arg Thr Gly Val Arg
30 35 40

aca att acc act ggc agc ccc atc acg tac tcc acc tac ggc aag ttc 12855
Thr Ile Thr Thr Gly Ser Pro Ile Thr Tyr Ser Thr Tyr Gly Lys Phe
45 50 55

ctt gcc gac ggc ggg tgc tcg ggg ggc gct tat gac ata ata att tgt 12903
Leu Ala Asp Gly Gly Cys Ser Gly Gly Ala Tyr Asp Ile Ile Ile Cys
60 65 70 75

gac gag tgc cac tcc acg gat gcc aca tcc atc ttg ggc att ggc act 12951
Asp Glu Cys His Ser Thr Asp Ala Thr Ser Ile Leu Gly Ile Gly Thr
80 85 90

gtc ctt gac caa gca gag act gcg ggg gcg aga ctg gtt gtg ctc gcc 12999
Val Leu Asp Gln Ala Glu Thr Ala Gly Ala Arg Leu Val Val Leu Ala
95 100 105

acc gcc acc cct ccg ggc tcc gtc act gtg ccc cat ccc aac atc gag 13047
Thr Ala Thr Pro Pro Gly Ser Val Thr Val Pro His Pro Asn Ile Glu
110 115 120

gag gtt gct ctg tcc acc acc gga gag atc cct ttt tac ggc aag gct 13095
Glu Val Ala Leu Ser Thr Thr Gly Glu Ile Pro Phe Tyr Gly Lys Ala
125 130 135

atc ccc ctc gaa gta atc aag ggg ggg aga cat ctc atc ttc tgt cat 13143
Ile Pro Leu Glu Val Ile Lys Gly Gly Arg His Leu Ile Phe Cys His
140 145 150 155

tca aag aag aag tgc gac gaa ctc gcc gca aag ctg gtc gca ttg ggc 13191
Ser Lys Lys Lys Cys Asp Glu Leu Ala Ala Lys Leu Val Ala Leu Gly
160 165 170

atc aat gcc gtg gcc tac tac cgc ggt ctt gac gtg tcc gtc atc ccg 13239
Ile Asn Ala Val Ala Tyr Tyr Arg Gly Leu Asp Val Ser Val Ile Pro
175 180 185

acc agc ggc gat gtt gtc gtc gtg gca acc gat gcc ctc atg acc ggc 13287
Thr Ser Gly Asp Val Val Val Val Ala Thr Asp Ala Leu Met Thr Gly
190 195 200

tat acc ggc gac ttc gac tcg gtg ata gac tgc aat acg tgt gtc acc 13335
Tyr Thr Gly Asp Phe Asp Ser Val Ile Asp Cys Asn Thr Cys Val Thr
205 210 215
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cag aca gtc gat ttc age ctt gac cct acc ttc acc att gag aca ate 13383 
Gin Thr Val Asp Phe Ser Leu Asp Pro Thr Phe Thr lie Glu Thr lie 
220 225 230 235 

acg etc cce caa gat get gtc tec cge act caa cgt egg ggc agg act 13431 
Thr Leu Pro Gin Asp Ala Val Ser Arg Thr Gin Arg Arg Gly Arg Thr 
240 245 250 

ggc agg ggg aag cca ggc ate tac aga ttt gtg gca ccg ggg gag cge 13479 
Gly Arg Gly Lys Pro Gly lie Tyr Arg Phe Val Ala Pro Gly Glu Arg 
255 260 265 

cce tec ggc atg ttc gac teg tec gtc etc tgt gag tgc tat gac gca 13527 
Pro Ser Gly Met Phe Asp Ser Ser Val Leu Cys Glu Cys Tyr Asp Ala 
270 275 280 

ggc tgt get tgg tat gag etc acg cce gee gag act aca gtt agg eta 13575 
Gly Cys Ala Trp Tyr Glu Leu Thr Pro Ala Glu Thr Thr Val Arg Leu 
285 290 295 

cga geg tac atg aac ace ccg ggg ctt cce gtg tgc cag gac cat ctt 13623 
Arg Ala Tyr Met Asn Thr Pro Gly Leu Pro Val Cys Gin Asp His Leu 
300 305 310 315 

gaa ttt tgg gag ggc gtc ttt aca ggc etc act cat ata gat gee cac 13671 
Glu Phe Trp Glu Gly Val Phe Thr Gly Leu Thr His lie Asp Ala His 
320 325 330 

ttt eta tec cag aca aag cag agt ggg gag aac ctt cct tac ctg gta 13719 
Phe Leu Ser Gin Thr Lys Gin Ser Gly Glu Asn Leu Pro Tyr Leu Val 
335 340 345 

geg tac caa gee acc gtg tgc get agg get caa gee cct cce cca teg 13767 
Ala Tyr Gin Ala Thr Val Cys Ala Arg Ala Gin Ala Pro Pro Pro Ser 
350 355 360 

tgg gac cag atg tgg aag tgt ttg att cge etc aag cce ace etc cat 13615 
Trp Asp Gin Met Trp Lys Cys Leu He Arg Leu Lys Pro Thr Leu His 
365 370 375 

ggg cca aca cec ctg eta tac aga ctg ggc get gtt cag aat gaa ate 13863 
Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Ala Val Gin Asn Glu He 
380 385 390 395 

acc ctg acg cac cca gtc ace aaa tac ate atg aca tgc atg teg gee 13911 
Thr Leu Thr His Pro Val Thr Lys Tyr He Met Thr Cys Met Ser Ala 
400 405 410 

gac ctg gag gtc gtc acg age acc tgg gtg etc gtt ggc ggc gtc ctg 13959 
Asp Leu Glu Val Val Thr Ser Thr Trp Val Leu Val Gly Gly Val Leu 
415 420 425 

get get ttg gee geg tat tgc ctg tea aca ggc tgc gtg gtc ata gtg 14007 
Ala Ala Leu Ala Ala Tyr Cys Leu Ser Thr Gly Cys Val Val He Val 
430 435 440 

ggc agg gtc gtc ttg tcc ggg aag ccg gca atc ata cct gac agg gaa 14055
Gly Arg Val Val Leu Ser Gly Lys Pro Ala Ile Ile Pro Asp Arg Glu
445 450 455

gtc ctc tac cga gag ttc gat gag atg gaa gag tgc tct cag cac tta 14103
Val Leu Tyr Arg Glu Phe Asp Glu Met Glu Glu Cys Ser Gln His Leu
460 465 470 475

ccg tac atc gag caa ggg atg atg ctc gcc gag cag ttc aag cag aag 14151
Pro Tyr Ile Glu Gln Gly Met Met Leu Ala Glu Gln Phe Lys Gln Lys
480 485 490

gcc ctc ggc ctc ctg cag acc gcg tcc cgt cag gca gag gtt atc gcc 14199
Ala Leu Gly Leu Leu Gln Thr Ala Ser Arg Gln Ala Glu Val Ile Ala
495 500 505

cct gct gtc cag acc aac tgg caa aaa ctc gag acc ttc tgg gcg aag 14247
Pro Ala Val Gln Thr Asn Trp Gln Lys Leu Glu Thr Phe Trp Ala Lys
510 515 520

cat atg tgg aac ttc atc agt ggg ata caa tac ttg gcg ggc ttg tca 14295
His Met Trp Asn Phe Ile Ser Gly Ile Gln Tyr Leu Ala Gly Leu Ser
525 530 535

acg ctg cct ggt aac ccc gcc att gct tca ttg atg gct ttt aca gct 14343
Thr Leu Pro Gly Asn Pro Ala Ile Ala Ser Leu Met Ala Phe Thr Ala
540 545 550 555

gct gtc acc agc cca cta acc act agc caa acc ctc ctc ttc aac ata 14391
Ala Val Thr Ser Pro Leu Thr Thr Ser Gln Thr Leu Leu Phe Asn Ile
560 565 570

ttg ggg ggg tgg gtg gct gcc cag ctc gcc gcc ccc ggt gcc gct act 14439
Leu Gly Gly Trp Val Ala Ala Gln Leu Ala Ala Pro Gly Ala Ala Thr
575 580 585

gcc ttt gtg ggc gct ggc tta gct ggc gcc gcc atc ggc agt gtt gga 14487
Ala Phe Val Gly Ala Gly Leu Ala Gly Ala Ala Ile Gly Ser Val Gly
590 595 600

ctg ggg aag gtc ctc ata gac atc ctt gca ggg tat ggc gcg ggc gtg 14535
Leu Gly Lys Val Leu Ile Asp Ile Leu Ala Gly Tyr Gly Ala Gly Val
605 610 615

gcg gga gct ctt gtg gca ttc aag atc atg agc ggt gag gtc ccc tcc 14583
Ala Gly Ala Leu Val Ala Phe Lys Ile Met Ser Gly Glu Val Pro Ser
620 625 630 635

acg gag gac ctg gtc aat cta ctg ccc gcc atc ctc tcg ccc gga gcc 14631
Thr Glu Asp Leu Val Asn Leu Leu Pro Ala Ile Leu Ser Pro Gly Ala
640 645 650

ctc gta gtc ggc gtg gtc tgt gca gca ata ctg cgc cgg cac gtt ggc 14679
Leu Val Val Gly Val Val Cys Ala Ala Ile Leu Arg Arg His Val Gly
655 660 665

ccg ggc gag ggg gca gtg cag tgg atg aac cgg ctg ata gcc ttc gcc 14727
Pro Gly Glu Gly Ala Val Gln Trp Met Asn Arg Leu Ile Ala Phe Ala
670 675 680

tcc cgg ggg aac cat gtt tcc ccc acg cac tac gtg ccg gag agc gat 14775
Ser Arg Gly Asn His Val Ser Pro Thr His Tyr Val Pro Glu Ser Asp
685 690 695

gca gct gcc cgc gtc act gcc ata ctc agc agc ctc act gta acc cag 14823
Ala Ala Ala Arg Val Thr Ala Ile Leu Ser Ser Leu Thr Val Thr Gln
700 705 710 715

ctc ctg agg cga ctg cac cag tgg ata agc tcg gag tgt acc act cca 14871
Leu Leu Arg Arg Leu His Gln Trp Ile Ser Ser Glu Cys Thr Thr Pro
720 725 730

tgc tcc ggt tcc tgg cta agg gac atc tgg gac tgg ata tgc gag gtg 14919
Cys Ser Gly Ser Trp Leu Arg Asp Ile Trp Asp Trp Ile Cys Glu Val
735 740 745

ttg agc gac ttt aag acc tgg cta aaa gct aag ctc atg cca cag ctg 14967
Leu Ser Asp Phe Lys Thr Trp Leu Lys Ala Lys Leu Met Pro Gln Leu
750 755 760

cct ggg atc ccc ttt gtg tcc tgc cag cgc ggg tat aag ggg gtc tgg 15015
Pro Gly Ile Pro Phe Val Ser Cys Gln Arg Gly Tyr Lys Gly Val Trp
765 770 775

cga ggg gac ggc atc atg cac act cgc tgc cac tgt gga gct gag atc 15063
Arg Gly Asp Gly Ile Met His Thr Arg Cys His Cys Gly Ala Glu Ile
780 785 790 795

act gga cat gtc aaa aac ggg acg atg agg atc gtc ggt cct agg acc 15111
Thr Gly His Val Lys Asn Gly Thr Met Arg Ile Val Gly Pro Arg Thr
800 805 810

tgc agg aac atg tgg agt ggg acc ttc ccc att aat gcc tac acc acg 15159
Cys Arg Asn Met Trp Ser Gly Thr Phe Pro Ile Asn Ala Tyr Thr Thr
815 820 825

ggc ccc tgt acc ccc ctt cct gcg ccg aac tac acg ttc gcg cta tgg 15207
Gly Pro Cys Thr Pro Leu Pro Ala Pro Asn Tyr Thr Phe Ala Leu Trp
830 835 840

agg gtg tct gca gag gaa tac gtg gag ata agg cag gtg ggg gac ttc 15255
Arg Val Ser Ala Glu Glu Tyr Val Glu Ile Arg Gln Val Gly Asp Phe
845 850 855

cac tac gtg acg ggt atg act act gac aat ctt aaa tgc ccg tgc cag 15303
His Tyr Val Thr Gly Met Thr Thr Asp Asn Leu Lys Cys Pro Cys Gln
860 865 870 875

gtc cca tcg ccc gaa ttt ttc aca gaa ttg gac ggg gtg cgc cta cat 15351
Val Pro Ser Pro Glu Phe Phe Thr Glu Leu Asp Gly Val Arg Leu His
880 885 890

agg ttt gcg ccc ccc tgc aag ccc ttg ctg cgg gag gag gta tca ttc 15399
Arg Phe Ala Pro Pro Cys Lys Pro Leu Leu Arg Glu Glu Val Ser Phe
895 900 905

aga gta gga ctc cac gaa tac ccg gta ggg tcg caa tta cct tgc gag 15447
Arg Val Gly Leu His Glu Tyr Pro Val Gly Ser Gln Leu Pro Cys Glu
910 915 920

ccc gaa ccg gac gtg gcc gtg ttg acg tcc atg ctc act gat ccc tcc 15495
Pro Glu Pro Asp Val Ala Val Leu Thr Ser Met Leu Thr Asp Pro Ser
925 930 935

cat ata aca gca gag gcg gcc ggg cga agg ttg gcg agg gga tca ccc 15543
His Ile Thr Ala Glu Ala Ala Gly Arg Arg Leu Ala Arg Gly Ser Pro
940 945 950 955

ccc tct gtg gcc agc tcc tcg gct agc cag cta tcc gct cca tct ctc 15591
Pro Ser Val Ala Ser Ser Ser Ala Ser Gln Leu Ser Ala Pro Ser Leu
960 965 970

aag gca act tgc acc gct aac cat gac tcc cct gat gct gag ctc ata 15639
Lys Ala Thr Cys Thr Ala Asn His Asp Ser Pro Asp Ala Glu Leu Ile
975 980 985

gag gcc aac ctc cta tgg agg cag gag atg ggc ggc aac atc acc agg 15687
Glu Ala Asn Leu Leu Trp Arg Gln Glu Met Gly Gly Asn Ile Thr Arg
990 995 1000

gtt gag tca gaa aac aaa gtg gtg att ctg gac tcc ttc gat ccg ctt 15735
Val Glu Ser Glu Asn Lys Val Val Ile Leu Asp Ser Phe Asp Pro Leu
1005 1010 1015

gtg gcg gag gag gac gag cgg gag atc tcc gta ccc gca gaa atc ctg 15783
Val Ala Glu Glu Asp Glu Arg Glu Ile Ser Val Pro Ala Glu Ile Leu
1020 1025 1030 1035

cgg aag tct cgg aga ttc gcc cag gcc ctg ccc gtt tgg gcg cgg ccg 15831
Arg Lys Ser Arg Arg Phe Ala Gln Ala Leu Pro Val Trp Ala Arg Pro
1040 1045 1050

gac tat aac ccc ccg cta gtg gag acg tgg aaa aag ccc gac tac gaa 15879
Asp Tyr Asn Pro Pro Leu Val Glu Thr Trp Lys Lys Pro Asp Tyr Glu
1055 1060 1065

cca cct gtg gtc cat ggc tgc ccg ctt cca cct cca aag tcc cct cct 15927
Pro Pro Val Val His Gly Cys Pro Leu Pro Pro Pro Lys Ser Pro Pro
1070 1075 1080

gtg cct ccg cct cgg aag aag cgg acg gtg gtc ctc act gaa tca acc 15975
Val Pro Pro Pro Arg Lys Lys Arg Thr Val Val Leu Thr Glu Ser Thr
1085 1090 1095

cta tct act gcc ttg gcc gag ctc gcc acc aga agc ttt ggc agc tcc 16023
Leu Ser Thr Ala Leu Ala Glu Leu Ala Thr Arg Ser Phe Gly Ser Ser
1100 1105 1110 1115

tca act tcc ggc att acg ggc gac aat acg aca aca tcc tct gag ccc 16071
Ser Thr Ser Gly Ile Thr Gly Asp Asn Thr Thr Thr Ser Ser Glu Pro
1120 1125 1130

gcc cct tct ggc tgc ccc ccc gac tcc gac gct gag tcc tat tcc tcc 16119
Ala Pro Ser Gly Cys Pro Pro Asp Ser Asp Ala Glu Ser Tyr Ser Ser
1135 1140 1145

atg ccc ccc ctg gag ggg gag cct ggg gat ccg gat ctt agc gac ggg 16167
Met Pro Pro Leu Glu Gly Glu Pro Gly Asp Pro Asp Leu Ser Asp Gly
1150 1155 1160

tca tgg tca acg gtc agt agt gag gcc aac gcg gag gat gtc gtg tgc 16215
Ser Trp Ser Thr Val Ser Ser Glu Ala Asn Ala Glu Asp Val Val Cys
1165 1170 1175

tgc tca atg tct tac tct tgg aca ggc gca ctc gtc acc ccg tgc gcc 16263
Cys Ser Met Ser Tyr Ser Trp Thr Gly Ala Leu Val Thr Pro Cys Ala
1180 1185 1190 1195

gcg gaa gaa cag aaa ctg ccc atc aat gca cta agc aac tcg ttg cta 16311
Ala Glu Glu Gln Lys Leu Pro Ile Asn Ala Leu Ser Asn Ser Leu Leu
1200 1205 1210

cgt cac cac aat ttg gtg tat tcc acc acc tca cgc agt gct tgc caa 16359
Arg His His Asn Leu Val Tyr Ser Thr Thr Ser Arg Ser Ala Cys Gln
1215 1220 1225

agg cag aag aaa gtc aca ttt gac aga ctg caa gtt ctg gac agc cat 16407
Arg Gln Lys Lys Val Thr Phe Asp Arg Leu Gln Val Leu Asp Ser His
1230 1235 1240

tac cag gac gta ctc aag gag gtt aaa gca gcg gcg tca aaa gtg aag 16455
Tyr Gln Asp Val Leu Lys Glu Val Lys Ala Ala Ala Ser Lys Val Lys
1245 1250 1255

gct aac ttg cta tcc gta gag gaa gct tgc agc ctg acg ccc cca cac 16503
Ala Asn Leu Leu Ser Val Glu Glu Ala Cys Ser Leu Thr Pro Pro His
1260 1265 1270 1275

tca gcc aaa tcc aag ttt ggt tat ggg gca aaa gac gtc cgt tgc cat 16551
Ser Ala Lys Ser Lys Phe Gly Tyr Gly Ala Lys Asp Val Arg Cys His
1280 1285 1290

gcc aga aag gcc gta acc cac atc aac tcc gtg tgg aaa gac ctt ctg 16599
Ala Arg Lys Ala Val Thr His Ile Asn Ser Val Trp Lys Asp Leu Leu
1295 1300 1305

gaa gac aat gta aca cca ata gac act acc atc atg gct aag aac gag 16647
Glu Asp Asn Val Thr Pro Ile Asp Thr Thr Ile Met Ala Lys Asn Glu
1310 1315 1320

gtt ttc tgc gtt cag cct gag aag ggg ggt cgt aag cca gct cgt ctc 16695
Val Phe Cys Val Gln Pro Glu Lys Gly Gly Arg Lys Pro Ala Arg Leu
1325 1330 1335

atc gtg ttc ccc gat ctg ggc gtg cgc gtg tgc gaa aag atg gct ttg 16743
Ile Val Phe Pro Asp Leu Gly Val Arg Val Cys Glu Lys Met Ala Leu
1340 1345 1350 1355

tac gac gtg gtt aca aag ctc ccc ttg gcc gtg atg gga agc tcc tac 16791
Tyr Asp Val Val Thr Lys Leu Pro Leu Ala Val Met Gly Ser Ser Tyr
1360 1365 1370

gga ttc caa tac tca cca gga cag cgg gtt gaa ttc ctc gtg caa gcg 16839
Gly Phe Gln Tyr Ser Pro Gly Gln Arg Val Glu Phe Leu Val Gln Ala
1375 1380 1385

tgg aag tcc aag aaa acc cca atg ggg ttc tcg tat gat acc cgc tgc 16887
Trp Lys Ser Lys Lys Thr Pro Met Gly Phe Ser Tyr Asp Thr Arg Cys
1390 1395 1400

ttt gac tcc aca gtc act gag agc gac atc cgt acg gag gag gca atc 16935
Phe Asp Ser Thr Val Thr Glu Ser Asp Ile Arg Thr Glu Glu Ala Ile
1405 1410 1415

tac caa tgt tgt gac ctc gac ccc caa gcc cgc gtg gcc atc aag tcc 16983
Tyr Gln Cys Cys Asp Leu Asp Pro Gln Ala Arg Val Ala Ile Lys Ser
1420 1425 1430 1435

ctc acc gag agg ctt tat gtt ggg ggc cct ctt acc aat tca agg ggg 17031
Leu Thr Glu Arg Leu Tyr Val Gly Gly Pro Leu Thr Asn Ser Arg Gly
1440 1445 1450

gag aac tgc ggc tat cgc agg tgc cgc gcg agc ggc gta ctg aca act 17079
Glu Asn Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Leu Thr Thr
1455 1460 1465

agc tgt ggt aac acc ctc act tgc tac atc aag gcc cgg gca gcc tgt 17127
Ser Cys Gly Asn Thr Leu Thr Cys Tyr Ile Lys Ala Arg Ala Ala Cys
1470 1475 1480

cga gcc gca ggg ctc cag gac tgc acc atg ctc gtg tgt ggc gac gac 17175
Arg Ala Ala Gly Leu Gln Asp Cys Thr Met Leu Val Cys Gly Asp Asp
1485 1490 1495

tta gtc gtt atc tgt gaa agc gcg ggg gtc cag gag gac gcg gcg agc 17223
Leu Val Val Ile Cys Glu Ser Ala Gly Val Gln Glu Asp Ala Ala Ser
1500 1505 1510 1515

ctg aga gcc ttc acg gag gct atg acc agg tac tcc gcc ccc cct ggg 17271
Leu Arg Ala Phe Thr Glu Ala Met Thr Arg Tyr Ser Ala Pro Pro Gly
1520 1525 1530

gac ccc cca caa cca gaa tac gac ttg gag ctc ata aca tca tgc tcc 17319
Asp Pro Pro Gln Pro Glu Tyr Asp Leu Glu Leu Ile Thr Ser Cys Ser
1535 1540 1545

tcc aac gtg tca gtc gcc cac gac ggc gct gga aag agg gtc tac tac 17367
Ser Asn Val Ser Val Ala His Asp Gly Ala Gly Lys Arg Val Tyr Tyr
1550 1555 1560

ctc acc cgt gac cct aca acc ccc ctc gcg aga gct gcg tgg gag aca 17415
Leu Thr Arg Asp Pro Thr Thr Pro Leu Ala Arg Ala Ala Trp Glu Thr
1565 1570 1575

gca aga cac act cca gtc aat tcc tgg cta ggc aac ata atc atg ttt 17463
Ala Arg His Thr Pro Val Asn Ser Trp Leu Gly Asn Ile Ile Met Phe
1580 1585 1590 1595

gcc ccc aca ctg tgg gcg agg atg ata ctg atg acc cat ttc ttt agc 17511
Ala Pro Thr Leu Trp Ala Arg Met Ile Leu Met Thr His Phe Phe Ser
1600 1605 1610

gtc ctt ata gcc agg gac cag ctt gaa cag gcc ctc gat tgc gag atc 17559
Val Leu Ile Ala Arg Asp Gln Leu Glu Gln Ala Leu Asp Cys Glu Ile
1615 1620 1625

tac ggg gcc tgc tac tcc ata gaa cca ctg gat cta cct cca atc att 17607
Tyr Gly Ala Cys Tyr Ser Ile Glu Pro Leu Asp Leu Pro Pro Ile Ile
1630 1635 1640

caa aga ctc cat ggc ctc agc gca ttt tca ctc cac agt tac tct cca 17655
Gln Arg Leu His Gly Leu Ser Ala Phe Ser Leu His Ser Tyr Ser Pro
1645 1650 1655

ggt gaa atc aat agg gtg gcc gca tgc ctc aga aaa ctt ggg gta ccg 17703
Gly Glu Ile Asn Arg Val Ala Ala Cys Leu Arg Lys Leu Gly Val Pro
1660 1665 1670 1675

ccc ttg cga gct tgg aga cac cgg gcc cgg agc gtc cgc gct agg ctt 17751
Pro Leu Arg Ala Trp Arg His Arg Ala Arg Ser Val Arg Ala Arg Leu
1680 1685 1690

ctg gcc aga gga ggc agg gct gcc ata tgt ggc aag tac ctc ttc aac 17799
Leu Ala Arg Gly Gly Arg Ala Ala Ile Cys Gly Lys Tyr Leu Phe Asn
1695 1700 1705

tgg gca gta aga aca aag ctc aaa ctc act cca ata gcg gcc gct ggc 17847
Trp Ala Val Arg Thr Lys Leu Lys Leu Thr Pro Ile Ala Ala Ala Gly
1710 1715 1720

cag ctg gac ttg tcc ggc tgg ttc acg gct ggc tac agc ggg gga gac 17895
Gln Leu Asp Leu Ser Gly Trp Phe Thr Ala Gly Tyr Ser Gly Gly Asp
1725 1730 1735

att tat cac agc gtg tct cat gcc cgg ccc cgc tgg atc tgg ttt tgc 17943
Ile Tyr His Ser Val Ser His Ala Arg Pro Arg Trp Ile Trp Phe Cys
1740 1745 1750 1755

cta ctc ctg ctt gct gca ggg gta ggc atc tac ctc ctc ccc aac cga 17991
Leu Leu Leu Leu Ala Ala Gly Val Gly Ile Tyr Leu Leu Pro Asn Arg
1760 1765 1770

tgaatagtcg actttgttcc cactgtactt ttagctcgta caaaatacaa tatacttttc 18051

atttctccgt aaacaacatg ttttcccatg taatatcctt ttctattttt cgttccgtta 18111

ccaactttac acatacttta tatagctatt cacttctata cactaaaaaa ctaagacaat 18171

tttaattttg ctgcctgcca tatttcaatt tgttataaat tcctataatt tatcctatta 18231

gtagctaaaa aaagatgaat gtgaatcgaa tcctaagaga attggatctg atccacagga 18291

cgggtgtggt cgccatgatc gcgtagtcga tagtggctcc aagtagcgaa gcgagcagga 18351

ctgggcggcg gccaaagcgg tcggacagtg ctccgagaac gggtgcgcat agaaattgca 18411

tcaacgcata tagcgctagc agcacgccat agtgactggc gatgctgtcg gaatggacga 18471

tatcccgcaa gaggcccggc agtaccggca taaccaagcc tatgcctaca gcatccaggg 18531

tgacggtgcc gaggatgacg atgagcgcat tgttagattt catacacggt gcctgactgc 18591

gttagcaatt taactgtgat aaactaccgc attaaagctt tttctttcca attttttttt 18651

tttcgtcatt ataaaaatca ttacgaccga gattcccggg taataactga tataattaaa 18711

ttgaagctct aatttgtgag tttagtatac atgcatttac ttataataca gttttttagt 18771

tttgctggcc gcatcttctc aaatatgctt cccagcctgc ttttctgtaa cgttcaccct 18831

ctaccttagc atcccttccc tttgcaaata gtcctcttcc aacaataata atgtcagatc 18891

ctgtagagac cacatcatcc acggttctat actgttgacc caatgcgtct cccttgtcat 18951

ctaaacccac accgggtgtc ataatcaacc aatcgtaacc ttcatctctt ccacccatgt 19011

ctctttgagc aataaagccg ataacaaaat ctttgtcgct cttcgcaatg tcaacagtac 19071

ccttagtata ttctccagta gatagggagc ccttgcatga caattctgct aacatcaaaa 19131

ggcctctagg ttcctttgtt acttcttctg ccgcctgctt caaaccgcta acaatacctg 19191

ggcccaccac accgtgtgca ttcgtaatgt ctgcccattc tgctattctg tatacacccg 19251

cagagtactg caatttgact gtattaccaa tgtcagcaaa ttttctgtct tcgaagagta 19311

aaaaattgta cttggcggat aatgccttta gcggcttaac tgtgccctcc 19371

cagtcaagat atccacatgt gtttttagta aacaaatttt gggacctaat gcttcaacta 19431

actccagtaa ttccttggtg gtacgaacat ccaatgaagc acacaagttt gtttgctttt 19491

cgtgcatgat attaaatagc ttggcagcaa caggactagg atgagtagca gcacgttcct 19551

tatatgtagc tttcgacatg atttatcttc gtttcctgca ggtttttgtt ctgtgcagtt 19611

gggttaagaa tactgggcaa tttcatgttt cttcaacact acatatgcgt atatatacca 19671

atctaagtct gtgctccttc cttcgttctt ccttctgttc ggagattacc gaatcaaaaa 19731

aatttcaagg aaaccgaaat caaaaaaaag aataaaaaaa aaatgatgaa ttgaaaagct 19791

tatcgat 19798



<210> 11 
<211> 1771 
<212> PRT 

<213> Artificial Sequence 

<220> 

<223> Description of Artificial Sequence: 
pd.deltaNS3NS5.pj

<400> 11 

Met Ala Ala Tyr Ala Ala Gln Gly Tyr Lys Val Leu Val Leu Asn Pro
1 5 10 15

Ser Val Ala Ala Thr Leu Gly Phe Gly Ala Tyr Met Ser Lys Ala His
20 25 30

Gly Ile Asp Pro Asn Ile Arg Thr Gly Val Arg Thr Ile Thr Thr Gly
35 40 45

Ser Pro Ile Thr Tyr Ser Thr Tyr Gly Lys Phe Leu Ala Asp Gly Gly
50 55 60

Cys Ser Gly Gly Ala Tyr Asp Ile Ile Ile Cys Asp Glu Cys His Ser
65 70 75 80

Thr Asp Ala Thr Ser Ile Leu Gly Ile Gly Thr Val Leu Asp Gln Ala
85 90 95

Glu Thr Ala Gly Ala Arg Leu Val Val Leu Ala Thr Ala Thr Pro Pro
100 105 110

Gly Ser Val Thr Val Pro His Pro Asn Ile Glu Glu Val Ala Leu Ser
115 120 125

Thr Thr Gly Glu Ile Pro Phe Tyr Gly Lys Ala Ile Pro Leu Glu Val
130 135 140

Ile Lys Gly Gly Arg His Leu Ile Phe Cys His Ser Lys Lys Lys Cys
145 150 155 160

Asp Glu Leu Ala Ala Lys Leu Val Ala Leu Gly Ile Asn Ala Val Ala
165 170 175

Tyr Tyr Arg Gly Leu Asp Val Ser Val Ile Pro Thr Ser Gly Asp Val
180 185 190

Val Val Val Ala Thr Asp Ala Leu Met Thr Gly Tyr Thr Gly Asp Phe
195 200 205

Asp Ser Val Ile Asp Cys Asn Thr Cys Val Thr Gln Thr Val Asp Phe
210 215 220

Ser Leu Asp Pro Thr Phe Thr Ile Glu Thr Ile Thr Leu Pro Gln Asp
225 230 235 240

Ala Val Ser Arg Thr Gln Arg Arg Gly Arg Thr Gly Arg Gly Lys Pro
245 250 255

Gly Ile Tyr Arg Phe Val Ala Pro Gly Glu Arg Pro Ser Gly Met Phe
260 265 270

Asp Ser Ser Val Leu Cys Glu Cys Tyr Asp Ala Gly Cys Ala Trp Tyr
275 280 285

Glu Leu Thr Pro Ala Glu Thr Thr Val Arg Leu Arg Ala Tyr Met Asn
290 295 300

Thr Pro Gly Leu Pro Val Cys Gln Asp His Leu Glu Phe Trp Glu Gly
305 310 315 320

Val Phe Thr Gly Leu Thr His Ile Asp Ala His Phe Leu Ser Gln Thr
325 330 335

Lys Gln Ser Gly Glu Asn Leu Pro Tyr Leu Val Ala Tyr Gln Ala Thr
340 345 350

Val Cys Ala Arg Ala Gln Ala Pro Pro Pro Ser Trp Asp Gln Met Trp
355 360 365

Lys Cys Leu Ile Arg Leu Lys Pro Thr Leu His Gly Pro Thr Pro Leu
370 375 380

Leu Tyr Arg Leu Gly Ala Val Gln Asn Glu Ile Thr Leu Thr His Pro
385 390 395 400



Val Thr Lys Tyr Ile Met Thr Cys Met Ser Ala Asp Leu Glu Val Val
405 410 415

Thr Ser Thr Trp Val Leu Val Gly Gly Val Leu Ala Ala Leu Ala Ala
420 425 430

Tyr Cys Leu Ser Thr Gly Cys Val Val Ile Val Gly Arg Val Val Leu
435 440 445

Ser Gly Lys Pro Ala Ile Ile Pro Asp Arg Glu Val Leu Tyr Arg Glu
450 455 460

Phe Asp Glu Met Glu Glu Cys Ser Gln His Leu Pro Tyr Ile Glu Gln
465 470 475 480

Gly Met Met Leu Ala Glu Gln Phe Lys Gln Lys Ala Leu Gly Leu Leu
485 490 495

Gln Thr Ala Ser Arg Gln Ala Glu Val Ile Ala Pro Ala Val Gln Thr
500 505 510

Asn Trp Gln Lys Leu Glu Thr Phe Trp Ala Lys His Met Trp Asn Phe
515 520 525

Ile Ser Gly Ile Gln Tyr Leu Ala Gly Leu Ser Thr Leu Pro Gly Asn
530 535 540

Pro Ala Ile Ala Ser Leu Met Ala Phe Thr Ala Ala Val Thr Ser Pro
545 550 555 560

Leu Thr Thr Ser Gln Thr Leu Leu Phe Asn Ile Leu Gly Gly Trp Val
565 570 575

Ala Ala Gln Leu Ala Ala Pro Gly Ala Ala Thr Ala Phe Val Gly Ala
580 585 590

Gly Leu Ala Gly Ala Ala Ile Gly Ser Val Gly Leu Gly Lys Val Leu
595 600 605

Ile Asp Ile Leu Ala Gly Tyr Gly Ala Gly Val Ala Gly Ala Leu Val
610 615 620

Ala Phe Lys Ile Met Ser Gly Glu Val Pro Ser Thr Glu Asp Leu Val
625 630 635 640

Asn Leu Leu Pro Ala Ile Leu Ser Pro Gly Ala Leu Val Val Gly Val
645 650 655

Val Cys Ala Ala Ile Leu Arg Arg His Val Gly Pro Gly Glu Gly Ala
660 665 670

Val Gln Trp Met Asn Arg Leu Ile Ala Phe Ala Ser Arg Gly Asn His
675 680 685

Val Ser Pro Thr His Tyr Val Pro Glu Ser Asp Ala Ala Ala Arg Val
690 695 700

Thr Ala Ile Leu Ser Ser Leu Thr Val Thr Gln Leu Leu Arg Arg Leu
705 710 715 720



His Gln Trp Ile Ser Ser Glu Cys Thr Thr Pro Cys Ser Gly Ser Trp
725 730 735

Leu Arg Asp Ile Trp Asp Trp Ile Cys Glu Val Leu Ser Asp Phe Lys
740 745 750

Thr Trp Leu Lys Ala Lys Leu Met Pro Gln Leu Pro Gly Ile Pro Phe
755 760 765

Val Ser Cys Gln Arg Gly Tyr Lys Gly Val Trp Arg Gly Asp Gly Ile
770 775 780

Met His Thr Arg Cys His Cys Gly Ala Glu Ile Thr Gly His Val Lys
785 790 795 800

Asn Gly Thr Met Arg Ile Val Gly Pro Arg Thr Cys Arg Asn Met Trp
805 810 815

Ser Gly Thr Phe Pro Ile Asn Ala Tyr Thr Thr Gly Pro Cys Thr Pro
820 825 830

Leu Pro Ala Pro Asn Tyr Thr Phe Ala Leu Trp Arg Val Ser Ala Glu
835 840 845

Glu Tyr Val Glu Ile Arg Gln Val Gly Asp Phe His Tyr Val Thr Gly
850 855 860

Met Thr Thr Asp Asn Leu Lys Cys Pro Cys Gln Val Pro Ser Pro Glu
865 870 875 880

Phe Phe Thr Glu Leu Asp Gly Val Arg Leu His Arg Phe Ala Pro Pro
885 890 895

Cys Lys Pro Leu Leu Arg Glu Glu Val Ser Phe Arg Val Gly Leu His
900 905 910

Glu Tyr Pro Val Gly Ser Gln Leu Pro Cys Glu Pro Glu Pro Asp Val
915 920 925

Ala Val Leu Thr Ser Met Leu Thr Asp Pro Ser His Ile Thr Ala Glu
930 935 940

Ala Ala Gly Arg Arg Leu Ala Arg Gly Ser Pro Pro Ser Val Ala Ser
945 950 955 960

Ser Ser Ala Ser Gln Leu Ser Ala Pro Ser Leu Lys Ala Thr Cys Thr
965 970 975

Ala Asn His Asp Ser Pro Asp Ala Glu Leu Ile Glu Ala Asn Leu Leu
980 985 990

Trp Arg Gln Glu Met Gly Gly Asn Ile Thr Arg Val Glu Ser Glu Asn
995 1000 1005

Lys Val Val Ile Leu Asp Ser Phe Asp Pro Leu Val Ala Glu Glu Asp
1010 1015 1020

Glu Arg Glu Ile Ser Val Pro Ala Glu Ile Leu Arg Lys Ser Arg Arg
1025 1030 1035 1040

Phe Ala Gln Ala Leu Pro Val Trp Ala Arg Pro Asp Tyr Asn Pro Pro
1045 1050 1055

Leu Val Glu Thr Trp Lys Lys Pro Asp Tyr Glu Pro Pro Val Val His
1060 1065 1070

Gly Cys Pro Leu Pro Pro Pro Lys Ser Pro Pro Val Pro Pro Pro Arg
1075 1080 1085

Lys Lys Arg Thr Val Val Leu Thr Glu Ser Thr Leu Ser Thr Ala Leu
1090 1095 1100

Ala Glu Leu Ala Thr Arg Ser Phe Gly Ser Ser Ser Thr Ser Gly Ile
1105 1110 1115 1120

Thr Gly Asp Asn Thr Thr Thr Ser Ser Glu Pro Ala Pro Ser Gly Cys
1125 1130 1135

Pro Pro Asp Ser Asp Ala Glu Ser Tyr Ser Ser Met Pro Pro Leu Glu
1140 1145 1150

Gly Glu Pro Gly Asp Pro Asp Leu Ser Asp Gly Ser Trp Ser Thr Val
1155 1160 1165

Ser Ser Glu Ala Asn Ala Glu Asp Val Val Cys Cys Ser Met Ser Tyr
1170 1175 1180

Ser Trp Thr Gly Ala Leu Val Thr Pro Cys Ala Ala Glu Glu Gln Lys
1185 1190 1195 1200

Leu Pro Ile Asn Ala Leu Ser Asn Ser Leu Leu Arg His His Asn Leu
1205 1210 1215

Val Tyr Ser Thr Thr Ser Arg Ser Ala Cys Gln Arg Gln Lys Lys Val
1220 1225 1230

Thr Phe Asp Arg Leu Gln Val Leu Asp Ser His Tyr Gln Asp Val Leu
1235 1240 1245

Lys Glu Val Lys Ala Ala Ala Ser Lys Val Lys Ala Asn Leu Leu Ser
1250 1255 1260

Val Glu Glu Ala Cys Ser Leu Thr Pro Pro His Ser Ala Lys Ser Lys
1265 1270 1275 1280

Phe Gly Tyr Gly Ala Lys Asp Val Arg Cys His Ala Arg Lys Ala Val
1285 1290 1295

Thr His Ile Asn Ser Val Trp Lys Asp Leu Leu Glu Asp Asn Val Thr
1300 1305 1310

Pro Ile Asp Thr Thr Ile Met Ala Lys Asn Glu Val Phe Cys Val Gln
1315 1320 1325

Pro Glu Lys Gly Gly Arg Lys Pro Ala Arg Leu Ile Val Phe Pro Asp
1330 1335 1340

Leu Gly Val Arg Val Cys Glu Lys Met Ala Leu Tyr Asp Val Val Thr
1345 1350 1355 1360

Lys Leu Pro Leu Ala Val Met Gly Ser Ser Tyr Gly Phe Gln Tyr Ser
1365 1370 1375

Pro Gly Gln Arg Val Glu Phe Leu Val Gln Ala Trp Lys Ser Lys Lys
1380 1385 1390

Thr Pro Met Gly Phe Ser Tyr Asp Thr Arg Cys Phe Asp Ser Thr Val
1395 1400 1405

Thr Glu Ser Asp Ile Arg Thr Glu Glu Ala Ile Tyr Gln Cys Cys Asp
1410 1415 1420

Leu Asp Pro Gln Ala Arg Val Ala Ile Lys Ser Leu Thr Glu Arg Leu
1425 1430 1435 1440

Tyr Val Gly Gly Pro Leu Thr Asn Ser Arg Gly Glu Asn Cys Gly Tyr
1445 1450 1455

Arg Arg Cys Arg Ala Ser Gly Val Leu Thr Thr Ser Cys Gly Asn Thr
1460 1465 1470

Leu Thr Cys Tyr Ile Lys Ala Arg Ala Ala Cys Arg Ala Ala Gly Leu
1475 1480 1485

Gln Asp Cys Thr Met Leu Val Cys Gly Asp Asp Leu Val Val Ile Cys
1490 1495 1500

Glu Ser Ala Gly Val Gln Glu Asp Ala Ala Ser Leu Arg Ala Phe Thr
1505 1510 1515 1520

Glu Ala Met Thr Arg Tyr Ser Ala Pro Pro Gly Asp Pro Pro Gln Pro
1525 1530 1535

Glu Tyr Asp Leu Glu Leu Ile Thr Ser Cys Ser Ser Asn Val Ser Val
1540 1545 1550

Ala His Asp Gly Ala Gly Lys Arg Val Tyr Tyr Leu Thr Arg Asp Pro
1555 1560 1565

Thr Thr Pro Leu Ala Arg Ala Ala Trp Glu Thr Ala Arg His Thr Pro
1570 1575 1580

Val Asn Ser Trp Leu Gly Asn Ile Ile Met Phe Ala Pro Thr Leu Trp
1585 1590 1595 1600

Ala Arg Met Ile Leu Met Thr His Phe Phe Ser Val Leu Ile Ala Arg
1605 1610 1615

Asp Gln Leu Glu Gln Ala Leu Asp Cys Glu Ile Tyr Gly Ala Cys Tyr
1620 1625 1630

Ser Ile Glu Pro Leu Asp Leu Pro Pro Ile Ile Gln Arg Leu His Gly
1635 1640 1645

Leu Ser Ala Phe Ser Leu His Ser Tyr Ser Pro Gly Glu Ile Asn Arg
1650 1655 1660

Val Ala Ala Cys Leu Arg Lys Leu Gly Val Pro Pro Leu Arg Ala Trp
1665 1670 1675 1680

Arg His Arg Ala Arg Ser Val Arg Ala Arg Leu Leu Ala Arg Gly Gly
1685 1690 1695

Arg Ala Ala Ile Cys Gly Lys Tyr Leu Phe Asn Trp Ala Val Arg Thr
1700 1705 1710

Lys Leu Lys Leu Thr Pro Ile Ala Ala Ala Gly Gln Leu Asp Leu Ser
1715 1720 1725

Gly Trp Phe Thr Ala Gly Tyr Ser Gly Gly Asp Ile Tyr His Ser Val
1730 1735 1740

Ser His Ala Arg Pro Arg Trp Ile Trp Phe Cys Leu Leu Leu Leu Ala
1745 1750 1755 1760

Ala Gly Val Gly Ile Tyr Leu Leu Pro Asn Arg
1765 1770



<210> 12 
<211> 20220 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 
pd.delta.NS3NS5.pj.core121

<220>
<221> CDS
<222> (12679)..(18354)



<400> 12 
atcgatccta ccccttgcgc taaagaagta tatgtgccta ctaacgcttg tctttgtctc 60

tgtcactaaa cactggatta ttactcccag atacttattt tggactaatt taaatgattt 120

cggatcaacg ttcttaatat cgctgaatct tccacaattg atgaaagtag ctaggaagag 180

gaattggtat aaagtttttg tttttgtaaa tctcgaagta tactcaaacg aatttagtat 240

tttctcagtg atctcccaga tgctttcacc ctcacttaga agtgctttaa gcattttttt 300

actgtggcta tttcccttat ctgcttcttc cgatgattcg aactgtaatt gcaaactact 360

tacaatatca gtgatatcag attgatgttt ttgtccatag taaggaataa ttgtaaattc 420

ccaagcagga atcaatttct ttaatgaggc ttccagaatt gttgcttttt gcgtcttgta 480

tttaaactgg agtgatttat tgacaatatc gaaactcagc gaattgctta tgatagtatt 540

atagctcatg aatgtggctc tcttgattgc tgttccgtta tgtgtaatca tccaacataa 600

ataggttagt tcagcagcac ataatgctat tttctcacct gaaggtcttt caaacctttc 660

cacaaactga cgaacaagca ccttaggtgg tgttttacat aatatatcaa attgtggcat 720

gcttagcgcc gatcttgtgt gcaattgata tctagtttca actactctat ttatcttgta 780


tcttgcagta ttcaaacacg ctaactcgaa aaactaactt taattgtcct gtttgtctcg 840

cgttctttcg aaaaatgcac cggccgcgca ttatttgtac tgcgaaaata attggtactg 900

cggtatcttc atttcatatt ttaaaaatgc acctttgctg cttttcctta atttttagac 960

ggcccgcagg ttcgttttgc ggtactatct tgtgataaaa agttgttttg acatgtgatc 1020

tgcacagatt ttataatgta ataagcaaga atacattatc aaacgaacaa tactggtaaa 1080

agaaaaccaa aatggacgac attgaaacag ccaagaatct gacggtaaaa gcacgtacag 1140

cttatagcgt ctgggatgta tgtcggctgt ttattgaaat gattgctcct gatgtagata 1200

ttgatataga gagtaaacgt aagtctgatg agctactctt tccaggatat gtcataaggc 1260

ccatggaatc tctcacaacc ggtaggccgt atggtcttga ttctagcgca gaagattcca 1320

gcgtatcttc tgactccagt gctgaggtaa ttttgcctgc tgcgaagatg gttaaggaaa 1380

ggtttgattc gattggaaat ggtatgctct cttcacaaga agcaagtcag gctgccatag 1440

atttgatgct acagaataac aagctgttag acaatagaaa gcaactatac aaatctattg 1500

ctataataat aggaagattg cccgagaaag acaagaagag agctaccgaa atgctcatga 1560

gaaaaatgga ttgtacacag ttattagtcc caccagctcc aacggaagaa gatgttatga 1620

agctcgtaag cgtcgttacc caattgctta ctttagttcc accagatcgt caagctgctt 1680

taataggtga tttattcatc ccggaatctc taaaggatat attcaatagt ttcaatgaac 1740

tggcggcaga gaatcgttta cagcaaaaaa agagtgagtt ggaaggaagg actgaagtga 1800

accatgctaa tacaaatgaa gaagttccct ccaggcgaac aagaagtaga gacacaaatg 1860

caagaggagc atataaatta caaaacacca tcactgaggg ccctaaagcg gttcccacga 1920

aaaaaaggag agtagcaacg agggtaaggg gcagaaaatc acgtaatact tctagggtat 1980

gatccaatat caaaggaaat gatagcattg aaggatgaga ctaatccaat tgaggagtgg 2040

cagcatatag aacagctaaa gggtagtgct gaaggaagca tacgataccc cgcatggaat 2100

gggataatat cacaggaggt actagactac ctttcatcct acataaatag acgcatataa 2160

gtacgcattt aagcataaac acgcactatg ccgttcttct catgtatata tatatacagg 2220

caacacgcag atataggtgc gacgtgaaca gtgagctgta tgtgcgcagc tcgcgttgca 2280

ttttcggaag cgctcgtttt cggaaacgct ttgaagttcc tattccgaag ttcctattct 2340

ctagaaagta taggaacttc agagcgcttt tgaaaaccaa aagcgctctg aagacgcact 2400

ttcaaaaaac caaaaacgca ccggactgta acgagctact aaaatattgc gaataccgct 2460

tccacaaaca ttgctcaaaa gtatctcttt gctatatatc tctgtgctat atccctatat 2520

aacctaccca tccacctttc gctccttgaa cttgcatcta aactcgacct ctacatcaac 2580


aggcttccaa tgctcttcaa attttactgt caagtagacc catacggctg taatatgctg 2640

ctcttcataa tgtaagctta tctttatcga atcgtgtgaa aaactactac cgcgataaac 2700

ctttacggtt ccctgagatt gaattagttc ctttagtata tgatacaaga cacttttgaa 2760

ctttgtacga cgaattttga ggttcgccat cctctggcta tttccaatta tcctgtcggc 2820

tattatctcc gcctcagttc gatcttccgc ttcagactgc catttttcac ataatgaatc 2880

tatttcaccc cacaatcctt catccgcctc cgcatcttgt tccgttaaac tattgacttc 2940

atgttgtaca ttgtttagtt cacgagaagg gtcctcttca ggcggtagct cctgatctcc 3000

tatatgacct ttatcctgtt ctctttccac aaacttagaa atgtattcat gaattatgga 3060

gcacctaata acattcttca ^99cggagaa gtttgggcca gatgcccaat atgcttgaca 3120

tgaaaacgtg agaatgaatt tagtattatt gtgatattct gaggcaattt tattataatc 3180

tcgaagataa gagaagaatg cagtgacctt tgtattgaca aatggagatt ccatgtatct 3240

aaaaaatacg cctttaggcc ttctgatacc ctttcccctg cggtttagcg tgccttttac 3300

attaatatct aaaccctctc cgatggtggc ctttaactga ctaataaatg caaccgatat 3360

aaactgtgat aattctgggt gatttatgat tcgatcgaca attgtattgt acactagtgc 3420

aggatcaggc caatccagtt ctttttcaat taccggtgtg tcgtctgtat tcagtacatg 3480

tccaacaaat gcaaatgcta acgttttgta tttcttataa ttgtcaggaa ctggaaaagt 3540

cccccttgtc gtctcgatta cacacctact ttcatcgtac accataggtt ggaagtgctg 3600

cataatacat tgcttaatac aagcaagcag tctctcgcca ttcatatttc agttattttc 3660

cattacagct gatgtcattg tatatcagcg ctgtaaaaat ctatctgtta cagaaggttt 3720

tcgcggtttt tataaacaaa actttcgtta cgaaatcgag caatcacccc agctgcgtat 3780

ttggaaattc gggaaaaagt agagcaacgc gagttgcatt ttttacacca taatgcatga 3840

ttaacttcga gaagggatta aggctaattt cactagtatg tttcaaaaac ctcaatctgt 3900

ccattgaatg ccttataaaa cagctataga ttgcatagaa gagttagcta ctcaatgctt 3960

tttgtcaaag cttactgatg atgatgtgtc tactttcagg cgggtctgta gtaaggagaa 4020

tgacattata aagctggcac ttagaattcc acggactata gactatacta gtatactccg 4080

tctactgtac gatacacttc cgctcaggtc cttgtccttt aacgaggcct taccactctt 4140

ttgttactct attgatccag ctcagcaaag gcagtgtgat ctaagattct atcttcgcga 4200

tgtagtaaaa ctagctagac cgagaaagag actagaaatg caaaaggcac ttctacaatg 4260

gctgccatca ttattatccg atgtgacgct gcattttttt tttttttttt tttttttttt 4320

tttttttttt tttttttttt ttttttggta caaatatcat aaaaaaagag aatcttttta 4380


agcaaggatt ttcttaactt cttcggcgac agcatcaccg acttcggtgg tactgttgga 4440

accacctaaa tcaccagttc tgatacctgc atccaaaacc tttttaactg catcttcaat 4500

ggctttacct tcttcaggca agttcaatga caatttcaac atcattgcag cagacaagat 4560

agtggcgata gggttgacct tattctttgg caaatctgga gcggaaccat ggcatggttc 4620

gtacaaacca aatgcggtgt tcttgtctgg caaagaggcc aaggacgcag atggcaacaa 4680

acccaaggag cctgggataa cggaggcttc atcggagatg atatcaccaa acatgttgct 4740

ggtgattata ataccattta ggtgggttgg gttcttaact aggatcatgg cggcagaatc 4800

aatcaattga tgttgaactt tcaatgtagg gaattcgttc ttgatggttt cctccacagt 4860

ttttctccat aatcttgaag aggccaaaac attagcttta tccaaggacc aaataggcaa 4920

tggtggctca tgttgtaggg ccatgaaagc ggccattctt gtgattcttt gcacttctgg 4980

aacggtgtat tgttcactat cccaagcgac accatcacca tcgtcttcct ttctcttacc 5040

aaagtaaata cctcccacta attctctaac aacaacgaag tcagtacctt tagcaaattg 5100

tggcttgatt ggagataagt ctaaaagaga gtcggatgca aagttacatg gtcttaagtt 5160

ggcgtacaat tgaagttctt tacggatttt tagtaaacct tgttcaggtc taacactacc 5220

ggtaccccat ttaggaccac ccacagcacc taacaaaacg gcatcagcct tcttggaggc 5280

ttccagcgcc tcatctggaa gtggaacacc tgtagcatcg atagcagcac caccaattaa 5340

atgattttcg aaatcgaact tgacattgga acgaacatca gaaatagctt taagaacctt 5400

aatggcttcg gctgtgattt cttgaccaac gtggtcacct ggcaaaacga cgatcttctt 5460

aggggcagac attacaatgg tatatccttg aaatatatat aaaaaaaaaa aaaaaaaaaa 5520

aaaaaaaaaa atgcagcttc tcaatgatat tcgaatacgc tttgaggaga tacagcctaa 5580

tatccgacaa actgttttac agatttacga tcgtacttgt tacccatcat tgaattttga 5640

acatccgaac ctgggagttt tccctgaaac agatagtata tttgaacctg tataataata 5700

tatagtctag cgctttacgg aagacaatgt atgtatttcg gttcctggag aaactattgc 5760

atctattgca taggtaatct tgcacgtcgc atccccggtt cattttctgc gtttccatct 5820

tgcacttcaa tagcatatct ttgttaacga agcatctgtg cttcattttg tagaacaaaa 5880

atgcaacgcg agagcgctaa tttttcaaac aaagaatctg agctgcattt ttacagaaca 5940

gaaatgcaac gcgaaagcgc tattttacca acgaagaatc tgtgcttcat ttttgtaaaa 6000

caaaaatgca acgcgagagc gctaattttt caaacaaaga atctgagctg catttttaca 6060

gaacagaaat gcaacgcgag agcgctattt taccaacaaa gaatctatac ttcttttttg 6120

ttctacaaaa atgcatcccg agagcgctat ttttctaaca aagcatctta gattactttt 6180


tttctccttt gtgcgctcta taatgcagtc tcttgataac tttttgcact gtaggtccgt 6240

taaggttaga agaaggctac tttggtgtct attttctctt ccataaaaaa agcctgactc 6300

cacttcccgc gtttactgat tactagcgaa gctgcgggtg cattttttca agataaaggc 6360

atccccgatt atattctata ccgatgtgga ttgcgcatac tttgtgaaca gaaagtgata 6420

gcgttgatga ttcttcattg gtcagaaaat tatgaacggt ttcttctatt ttgtctctat 6480

atactacgta taggaaatgt ttacattttc gtattgtttt cgattcactc tatgaatagt 6540

tcttactaca atttttttgt ctaaagagta atactagaga taaacataaa aaatgtagag 6600

gtcgagttta gatgcaagtt caaggagcga aaggtggatg ggtaggttat atagggatat 6660

agcacagaga tatatagcaa agagatactt ttgagcaatg tttgtggaag cggtattcgc 6720

aatattttag tagctcgtta cagtccggtg cgtttttggt tttttgaaag tgcgtcttca 6780

gagcgctttt ggttttcaaa agcgctctga agttcctata ctttctagag aataggaact 6840

tcggaatagg aacttcaaag cgtttccgaa aacgagcgct tccgaaaatg caacgcgagc 6900

tgcgcacata cagctcactg ttcacgtcgc acctatatct gcgtgttgcc tgtatatata 6960

tatacatgag aagaacggca tagtgcgtgt ttatgcttaa atgcgtactt atatgcgtct 7020

atttatgtag gatgaaaggt agtctagtac ctcctgtgat attatcccat tccatgcggg 7080

gtatcgtatg cttccttcag cactaccctt tagctgttct atatgctgcc actcctcaat 7140

tggattagtc tcatccttca atgctatcat ttcctttgat attggatcat atgcatagta 7200

ccgagaaacc agtgcgaagt agtgatcagg tattgctgtt atctgatgag tatacgttgt 7260

cctggccacg gcagaagcac gcttatcgct ccaatttccc acaacattag tcaactccgt 7320

taggcccttc attgaaagaa atgaggtcat caaatgtctt ccaatgtgag attttgggcc 7380

attttttata gcaaagattg aataaggcgc atttttcttc aaagctttat tgtacgatct 7440

gactaagtta tcttttaata attggtattc ctgtttattg cttgaagaat tgccggtcct 7500

atttactcgt tttaggactg gttcagaatt cctcaaaaat tcatccaaat atacaagtgg 7560

atcgatgata agctgtcaaa catgagaatt cttgaagacg aaagggcctc gtgatacgcc 7620

tatttttata ggttaatgtc atgataataa tggtttctta gacgtcaggt ggcacttttc 7680

ggggaaatgt gcgcggaacc cctatttgtt tatttttcta aatacattca aatatgtatc 7740

cgctcatgag acaataaccc tgataaatgc ttcaataata ttgaaaaagg aagagtatga 7800

gtattcaaca tttccgtgtc gcccttattc ccttttttgc ggcattttgc cttcctgttt 7860

ttgctcaccc agaaacgctg gtgaaagtaa aagatgctga agatcagttg ggtgcacgag 7920

tgggttacat cgaactggat ctcaacagcg gtaagatcct tgagagtttt cgccccgaag 7980


aacgttttcc aatgatgagc acttttaaag ttctgctatg tggcgcggta ttatcccgtg 8040

ttgacgccgg gcaagagcaa ctcggtcgcc gcatacacta ttctcagaat gacttggttg 8100

agtactcacc agtcacagaa aagcatctta cggatggcat gacagtaaga gaattatgca 8160

gtgctgccat aaccatgagt gataacactg cggccaactt acttctgaca acgatcggag 8220

gaccgaagga gctaaccgct tttttgcaca acatggggga tcatgtaact cgccttgatc 8280

gttgggaacc ggagctgaat gaagccatac caaacgacga gcgtgacacc acgatgcctg 8340

cagcaatggc aacaacgttg cgcaaactat taactggcga actacttact ctagcttccc 8400

ggcaacaatt aatagactgg atggaggcgg ataaagttgc aggaccactt ctgcgctcgg 8460

cccttccggc tggctggttt attgctgata aatctggagc cggtgagcgt gggtctcgcg 8520

gtatcattgc agcactgggg ccagatggta agccctcccg tatcgtagtt atctacacga 8580

cggggagtca ggcaactatg gatgaacgaa atagacagat cgctgagata ggtgcctcac 8640

tgattaagca ttggtaactg tcagaccaag tttactcata tatactttag attgatttaa 8700

aacttcattt ttaatttaaa aggatctagg tgaagatcct ttttgataat ctcatgacca 8760

aaatccctta acgtgagttt tcgttccact gagcgtcaga ccccgtagaa aagatcaaag 8820

gatcttcttg agatcctttt tttctgcgcg taatctgctg cttgcaaaca aaaaaaccac 8880

cgctaccagc ggtggtttgt ttgccggatc aagagctacc aactcttttt ccgaaggtaa 8940

ctggcttcag cagagcgcag ataccaaata ctgtccttct agtgtagccg tagttaggcc 9000

accacttcaa gaactctgta gcaccgccta catacctcgc tctgctaatc ctgttaccag 9060

tggctgctgc cagtggcgat aagtcgtgtc ttaccgggtt ggactcaaga cgatagttac 9120

cggataaggc gcagcggtcg ggctgaacgg ggggttcgtg cacacagccc agcttggagc 9180

gaacgaccta caccgaactg agatacctac agcgtgagct atgagaaagc gccacgcttc 9240

ccgaagggag aaaggcggac aggtatccgg taagcggcag ggtcggaaca ggagagcgca 9300

cgagggagct tccaggggga aacgcctggt atctttatag tcctgtcggg tttcgccacc 9360

tctgacttga gcgtcgattt ttgtgatgct cgtcaggggg gcggagccta tggaaaaacg 9420

ccagcaacgc ggccttttta cggttcctgg ccttttgctg gccttttgct cacatgttct 9480

ttcctgcgtt atcccctgat tctgtggata accgtattac cgcctttgag tgagctgata 9540

ccgctcgccg cagccgaacg accgagcgca gcgagtcagt gagcgaggaa gcggaagagc 9600

gcctgatgcg gtattttctc cttacgcatc tgtgcggtat ttcacaccgc atatggtgca 9660

ctctcagtac aatctgctct gatgccgcat agttaagcca gtatacactc cgctatcgct 9720

acgtgactgg gtcatggctg cgccccgaca cccgccaaca cccgctgacg cgccctgacg 9780


ggcttgtctg ctcccggcat ccgcttacag acaagctgtg accgtctccg ggagctgcat 9840

gtgtcagagg ttttcaccgt catcaccgaa acgcgcgagg cagctgcggt aaagctcatc 9900

agcgtggtcg tgaagcgatt cacagatgtc tgcctgttca tccgcgtcca gctcgttgag 9960

tttctccaga agcgttaatg tctggcttct gataaagcgg gccatgttaa gggcggtttt 10020

ttcctgtttg gtcactgatg cctccgtgta agggggattt ctgttcatgg gggtaatgat 10080

accgatgaaa cgagagagga tgctcacgat acgggttact gatgatgaac atgcccggtt 10140

actggaacgt tgtgagggta aacaactggc ggtatggatg cggcgggacc agagaaaaat 10200

cactcagggt caatgccagc gcttcgttaa tacagatgta ggtgttccac agggtagcca 10260

gcagcatcct gcgatgcaga tccggaacat aatggtgcag ggcgctgact tccgcgtttc 10320

cagactttac gaaacacgga aaccgaagac cattcatgtt gttgctcagg tcgcagacgt 10380

tttgcagcag cagtcgcttc acgttcgctc gcgtatcggt gattcattct gctaaccagt 10440

aaggcaaccc cgccagccta gccgggtcct caacgacagg agcacgatca tgcgcacccg 10500

tggccaggac ccaacgctgc ccgagatgcg ccgcgtgcgg ctgctggaga tggcggacgc 10560

gatggatatg ttctgccaag ggttggtttg cgcattcaca gttctccgca agaattgatt 10620

ggctccaatt cttggagtgg tgaatccgtt agcgaggtgc cgccggcttc cattcaggtc 10680

gaggtggccc ggctccatgc accgcgacgc aacgcgggga ggcagacaag gtatagggcg 10740

gcgcctacaa tccatgccaa cccgttccat gtgctcgccg aggcggcata aatcgccgtg 10800

acgatcagcg gtccaatgat cgaagttagg ctggtaagag ccgcgagcga tccttgaagc 10860

tgtccctgat ggtcgtcatc tacctgcctg gacagcatgg cctgcaacgc gggcatcccg 10920

atgccgccgg aagcgagaag aatcataatg gggaaggcca tccagcctcg cgtcgcgaac 10980

gccagcaaga cgtagcccag cgcgtcggcc gccatgccgg cgataatggc ctgcttctcg 11040

ccgaaacgtt tggtggcggg accagtgacg aaggcttgag cgagggcgtg caagattccg 11100

aataccgcaa gcgacaggcc gatcatcgtc gcgctccagc gaaagcggtc ctcgccgaaa 11160

atgacccaga gcgctgccgg cacctgtcct acgagttgca tgataaagaa gacagtcata 11220

agtgcggcga cgatagtcat gccccgcgcc caccggaagg agctgactgg gttgaaggct 11280

ctcaagggca tcggtcgagg atccttcaat atgcgcacat acgctgttat gttcaaggtc 11340

ccttcgttta agaacgaaag cggtcttcct tttgagggat gtttcaagtt gttcaaatct 11400

atcaaatttg caaatcccca gtctgtatct agagcgttga atcggtgatg cgatttgtta 11460

attaaattga tggtgtcacc attaccaggt ctagatatac caatggcaaa ctgagcacaa 11520

caataccagt ccggatcaac tggcaccatc tctcccgtag tctcatctaa tttttcttcc 11580


ggatgaggtt ccagatatac cgcaacacct ttattatggt ttccctgagg gaataataga 11640

atgtcccatt cgaaatcacc aattctaaac ctgggcgaat tgtatttcgg gtttgttaac 11700

tcgttccagt caggaatgtt ccacgtgaag ctatcttcca gcaaagtctc cacttcttca 11760

tcaaattgtg gagaatactc ccaatgctct tatctatggg acttccggga aacacagtac 11820

cgatacttcc caattcgtct tcagagctca ttgtttgttt gaagagacta atcaaagaat 11880

cgttttctca aaaaaattaa tatcttaact gatagtttga tcaaaggggc aaaacgtagg 11940

ggcaaacaaa cggaaaaatc gtttctcaaa ttttctgatg ccaagaactc taaccagtct 12000

tatctaaaaa ttgccttatg atccgtctct ccggttacag cctgtgtaac tgattaatcc 12060

tgcctttcta atcaccattc taatgtttta attaagggat tttgtcttca ttaacggctt 12120

tcgctcataa aaatgttatg acgttttgcc cgcaggcggg aaaccatcca cttcacgaga 12180

ctgatctcct ctgccggaac accgggcatc tccaacttat aagttggaga aataagagaa 12240

tttcagattg agagaatgaa aaaaaaaaac ccttagttca taggtccatt ctcttagcgc 12300

aactacagag aacaggggca caaacaggca aaaaacgggc acaacctcaa tggagtgatg 12360

caacctgcct ggagtaaatg atgacacaag gcaattgacc cacgcatgta tctatctcat 12420

tttcttacac cttctattac cttctgctct ctctgatttg gaaaaagctg aaaaaaaagg 12480

ttgaaaccag ttccctgaaa ttattcccct acttgactaa taagtatata aagacggtag 12540

gtattgattg taattctgta aatctatttc ttaaacttct taaattctac ttttatagtt 12600

agtctttttt ttagttttaa aacaccaaga acttagtttc gaataaacac acataaacaa 12660



acaagcttac aaaacaaa atg gct gca tat gca gct cag ggc tat aag gtg 12711

Met Ala Ala Tyr Ala Ala Gln Gly Tyr Lys Val
1 5 10

cta gta ctc aac ccc tct gtt gct gca aca ctg ggc ttt ggt gct tac 12759
Leu Val Leu Asn Pro Ser Val Ala Ala Thr Leu Gly Phe Gly Ala Tyr
15 20 25

atg tcc aag gct cat ggg atc gat cct aac atc agg acc ggg gtg aga 12807
Met Ser Lys Ala His Gly Ile Asp Pro Asn Ile Arg Thr Gly Val Arg
30 35 40

aca att acc act ggc agc ccc atc acg tac tcc acc tac ggc aag ttc 12855
Thr Ile Thr Thr Gly Ser Pro Ile Thr Tyr Ser Thr Tyr Gly Lys Phe
45 50 55

ctt gcc gac ggc ggg tgc tcg ggg ggc gct tat gac ata ata att tgt 12903
Leu Ala Asp Gly Gly Cys Ser Gly Gly Ala Tyr Asp Ile Ile Ile Cys
60 65 70 75

gac gag tgc cac tcc acg gat gcc aca tcc atc ttg ggc att ggc act 12951
Asp Glu Cys His Ser Thr Asp Ala Thr Ser Ile Leu Gly Ile Gly Thr
80 85 90

gtc ctt gac caa gca gag act gcg ggg gcg aga ctg gtt gtg ctc gcc 12999
Val Leu Asp Gln Ala Glu Thr Ala Gly Ala Arg Leu Val Val Leu Ala
95 100 105

acc gcc acc cct ccg ggc tcc gtc act gtg ccc cat ccc aac atc gag 13047
Thr Ala Thr Pro Pro Gly Ser Val Thr Val Pro His Pro Asn Ile Glu
110 115 120

gag gtt gct ctg tcc acc acc gga gag atc cct ttt tac ggc aag gct 13095
Glu Val Ala Leu Ser Thr Thr Gly Glu Ile Pro Phe Tyr Gly Lys Ala
125 130 135

atc ccc ctc gaa gta atc aag ggg ggg aga cat ctc atc ttc tgt cat 13143
Ile Pro Leu Glu Val Ile Lys Gly Gly Arg His Leu Ile Phe Cys His
140 145 150 155

tca aag aag aag tgc gac gaa ctc gcc gca aag ctg gtc gca ttg ggc 13191
Ser Lys Lys Lys Cys Asp Glu Leu Ala Ala Lys Leu Val Ala Leu Gly
160 165 170

atc aat gcc gtg gcc tac tac cgc ggt ctt gac gtg tcc gtc atc ccg 13239
Ile Asn Ala Val Ala Tyr Tyr Arg Gly Leu Asp Val Ser Val Ile Pro
175 180 185

acc agc ggc gat gtt gtc gtc gtg gca acc gat gcc ctc atg acc ggc 13287
Thr Ser Gly Asp Val Val Val Val Ala Thr Asp Ala Leu Met Thr Gly
190 195 200

tat acc ggc gac ttc gac tcg gtg ata gac tgc aat acg tgt gtc acc 13335
Tyr Thr Gly Asp Phe Asp Ser Val Ile Asp Cys Asn Thr Cys Val Thr
205 210 215

cag aca gtc gat ttc agc ctt gac cct acc ttc acc att gag aca atc 13383
Gln Thr Val Asp Phe Ser Leu Asp Pro Thr Phe Thr Ile Glu Thr Ile
220 225 230 235

acg ctc ccc caa gat gct gtc tcc cgc act caa cgt cgg ggc agg act 13431
Thr Leu Pro Gln Asp Ala Val Ser Arg Thr Gln Arg Arg Gly Arg Thr
240 245 250

ggc agg ggg aag cca ggc atc tac aga ttt gtg gca ccg ggg gag cgc 13479
Gly Arg Gly Lys Pro Gly Ile Tyr Arg Phe Val Ala Pro Gly Glu Arg
255 260 265

ccc tcc ggc atg ttc gac tcg tcc gtc ctc tgt gag tgc tat gac gca 13527
Pro Ser Gly Met Phe Asp Ser Ser Val Leu Cys Glu Cys Tyr Asp Ala
270 275 280

ggc tgt gct tgg tat gag ctc acg ccc gcc gag act aca gtt agg cta 13575
Gly Cys Ala Trp Tyr Glu Leu Thr Pro Ala Glu Thr Thr Val Arg Leu
285 290 295

cga gcg tac atg aac acc ccg ggg ctt ccc gtg tgc cag gac cat ctt 13623
Arg Ala Tyr Met Asn Thr Pro Gly Leu Pro Val Cys Gln Asp His Leu
300 305 310 315

gaa ttt tgg gag ggc gtc ttt aca ggc ctc act cat ata gat gcc cac 13671
Glu Phe Trp Glu Gly Val Phe Thr Gly Leu Thr His Ile Asp Ala His
320 325 330

ttt cta tcc cag aca aag cag agt ggg gag aac ctt cct tac ctg gta 13719
Phe Leu Ser Gln Thr Lys Gln Ser Gly Glu Asn Leu Pro Tyr Leu Val
335 340 345

gcg tac caa gcc acc gtg tgc gct agg gct caa gcc cct ccc cca tcg 13767
Ala Tyr Gln Ala Thr Val Cys Ala Arg Ala Gln Ala Pro Pro Pro Ser
350 355 360

tgg gac cag atg tgg aag tgt ttg att cgc ctc aag ccc acc ctc cat 13815
Trp Asp Gln Met Trp Lys Cys Leu Ile Arg Leu Lys Pro Thr Leu His
365 370 375

ggg cca aca ccc ctg cta tac aga ctg ggc gct gtt cag aat gaa atc 13863
Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Ala Val Gln Asn Glu Ile
380 385 390 395

acc ctg acg cac cca gtc acc aaa tac atc atg aca tgc atg tcg gcc 13911
Thr Leu Thr His Pro Val Thr Lys Tyr Ile Met Thr Cys Met Ser Ala
400 405 410

gac ctg gag gtc gtc acg agc acc tgg gtg ctc gtt ggc ggc gtc ctg 13959
Asp Leu Glu Val Val Thr Ser Thr Trp Val Leu Val Gly Gly Val Leu
415 420 425

gct gct ttg gcc gcg tat tgc ctg tca aca ggc tgc gtg gtc ata gtg 14007
Ala Ala Leu Ala Ala Tyr Cys Leu Ser Thr Gly Cys Val Val Ile Val
430 435 440

ggc agg gtc gtc ttg tcc ggg aag ccg gca atc ata cct gac agg gaa 14055
Gly Arg Val Val Leu Ser Gly Lys Pro Ala Ile Ile Pro Asp Arg Glu
445 450 455

gtc ctc tac cga gag ttc gat gag atg gaa gag tgc tct cag cac tta 14103
Val Leu Tyr Arg Glu Phe Asp Glu Met Glu Glu Cys Ser Gln His Leu
460 465 470 475

ccg tac atc gag caa ggg atg atg ctc gcc gag cag ttc aag cag aag 14151
Pro Tyr Ile Glu Gln Gly Met Met Leu Ala Glu Gln Phe Lys Gln Lys
480 485 490

gcc ctc ggc ctc ctg cag acc gcg tcc cgt cag gca gag gtt atc gcc 14199
Ala Leu Gly Leu Leu Gln Thr Ala Ser Arg Gln Ala Glu Val Ile Ala
495 500 505

cct gct gtc cag acc aac tgg caa aaa ctc gag acc ttc tgg gcg aag 14247
Pro Ala Val Gln Thr Asn Trp Gln Lys Leu Glu Thr Phe Trp Ala Lys
510 515 520

cat atg tgg aac ttc atc agt ggg ata caa tac ttg gcg ggc ttg tca 14295
His Met Trp Asn Phe Ile Ser Gly Ile Gln Tyr Leu Ala Gly Leu Ser
525 530 535

acg ctg cct ggt aac ccc gcc att gct tca ttg atg gct ttt aca gct 14343
Thr Leu Pro Gly Asn Pro Ala Ile Ala Ser Leu Met Ala Phe Thr Ala
540 545 550 555

gct gtc acc agc cca cta acc act agc caa acc ctc ctc ttc aac ata 14391
Ala Val Thr Ser Pro Leu Thr Thr Ser Gln Thr Leu Leu Phe Asn Ile
560 565 570

ttg ggg ggg tgg gtg gct gcc cag ctc gcc gcc ccc ggt gcc gct act 14439
Leu Gly Gly Trp Val Ala Ala Gln Leu Ala Ala Pro Gly Ala Ala Thr
575 580 585

gcc ttt gtg ggc gct ggc tta gct ggc gcc gcc atc ggc agt gtt gga 14487
Ala Phe Val Gly Ala Gly Leu Ala Gly Ala Ala Ile Gly Ser Val Gly
590 595 600

ctg ggg aag gtc ctc ata gac atc ctt gca ggg tat ggc gcg ggc gtg 14535
Leu Gly Lys Val Leu Ile Asp Ile Leu Ala Gly Tyr Gly Ala Gly Val
605 610 615

gcg gga gct ctt gtg gca ttc aag atc atg agc ggt gag gtc ccc tcc 14583
Ala Gly Ala Leu Val Ala Phe Lys Ile Met Ser Gly Glu Val Pro Ser
620 625 630 635

acg gag gac ctg gtc aat cta ctg ccc gcc atc ctc tcg ccc gga gcc 14631
Thr Glu Asp Leu Val Asn Leu Leu Pro Ala Ile Leu Ser Pro Gly Ala
640 645 650

ctc gta gtc ggc gtg gtc tgt gca gca ata ctg cgc cgg cac gtt ggc 14679
Leu Val Val Gly Val Val Cys Ala Ala Ile Leu Arg Arg His Val Gly
655 660 665

ccg ggc gag ggg gca gtg cag tgg atg aac cgg ctg ata gcc ttc gcc 14727
Pro Gly Glu Gly Ala Val Gln Trp Met Asn Arg Leu Ile Ala Phe Ala
670 675 680

tcc cgg ggg aac cat gtt tcc ccc acg cac tac gtg ccg gag agc gat 14775
Ser Arg Gly Asn His Val Ser Pro Thr His Tyr Val Pro Glu Ser Asp
685 690 695

gca gct gcc cgc gtc act gcc ata ctc agc agc ctc act gta acc cag 14823
Ala Ala Ala Arg Val Thr Ala Ile Leu Ser Ser Leu Thr Val Thr Gln
700 705 710 715

ctc ctg agg cga ctg cac cag tgg ata agc tcg gag tgt acc act cca 14871
Leu Leu Arg Arg Leu His Gln Trp Ile Ser Ser Glu Cys Thr Thr Pro
720 725 730

tgc tcc ggt tcc tgg cta agg gac atc tgg gac tgg ata tgc gag gtg 14919
Cys Ser Gly Ser Trp Leu Arg Asp Ile Trp Asp Trp Ile Cys Glu Val
735 740 745

ttg agc gac ttt aag acc tgg cta aaa gct aag ctc atg cca cag ctg 14967
Leu Ser Asp Phe Lys Thr Trp Leu Lys Ala Lys Leu Met Pro Gln Leu
750 755 760

cct ggg atc ccc ttt gtg tcc tgc cag cgc ggg tat aag ggg gtc tgg 15015
Pro Gly Ile Pro Phe Val Ser Cys Gln Arg Gly Tyr Lys Gly Val Trp
765 770 775

cga ggg gac ggc atc atg cac act cgc tgc cac tgt gga gct gag atc 15063
Arg Gly Asp Gly Ile Met His Thr Arg Cys His Cys Gly Ala Glu Ile
780 785 790 795

act gga cat gtc aaa aac ggg acg atg agg atc gtc ggt cct agg acc 15111
Thr Gly His Val Lys Asn Gly Thr Met Arg Ile Val Gly Pro Arg Thr
800 805 810

tgc agg aac atg tgg agt ggg acc ttc ccc att aat gcc tac acc acg 15159
Cys Arg Asn Met Trp Ser Gly Thr Phe Pro Ile Asn Ala Tyr Thr Thr
815 820 825

ggc ccc tgt acc ccc ctt cct gcg ccg aac tac acg ttc gcg cta tgg 15207
Gly Pro Cys Thr Pro Leu Pro Ala Pro Asn Tyr Thr Phe Ala Leu Trp
830 835 840

agg gtg tct gca gag gaa tac gtg gag ata agg cag gtg ggg gac ttc 15255
Arg Val Ser Ala Glu Glu Tyr Val Glu Ile Arg Gln Val Gly Asp Phe
845 850 855

cac tac gtg acg ggt atg act act gac aat ctt aaa tgc ccg tgc cag 15303
His Tyr Val Thr Gly Met Thr Thr Asp Asn Leu Lys Cys Pro Cys Gln
860 865 870 875

gtc cca tcg ccc gaa ttt ttc aca gaa ttg gac ggg gtg cgc cta cat 15351
Val Pro Ser Pro Glu Phe Phe Thr Glu Leu Asp Gly Val Arg Leu His
880 885 890

agg ttt gcg ccc ccc tgc aag ccc ttg ctg cgg gag gag gta tca ttc 15399
Arg Phe Ala Pro Pro Cys Lys Pro Leu Leu Arg Glu Glu Val Ser Phe
895 900 905

aga gta gga ctc cac gaa tac ccg gta ggg tcg caa tta cct tgc gag 15447
Arg Val Gly Leu His Glu Tyr Pro Val Gly Ser Gln Leu Pro Cys Glu
910 915 920

ccc gaa ccg gac gtg gcc gtg ttg acg tcc atg ctc act gat ccc tcc 15495
Pro Glu Pro Asp Val Ala Val Leu Thr Ser Met Leu Thr Asp Pro Ser
925 930 935

cat ata aca gca gag gcg gcc ggg cga agg ttg gcg agg gga tca ccc 15543
His Ile Thr Ala Glu Ala Ala Gly Arg Arg Leu Ala Arg Gly Ser Pro
940 945 950 955

ccc tct gtg gcc agc tcc tcg gct agc cag cta tcc gct cca tct ctc 15591
Pro Ser Val Ala Ser Ser Ser Ala Ser Gln Leu Ser Ala Pro Ser Leu
960 965 970

aag gca act tgc acc gct aac cat gac tcc cct gat gct gag ctc ata 15639
Lys Ala Thr Cys Thr Ala Asn His Asp Ser Pro Asp Ala Glu Leu Ile
975 980 985

gag gcc aac ctc cta tgg agg cag gag atg ggc ggc aac atc acc agg 15687
Glu Ala Asn Leu Leu Trp Arg Gln Glu Met Gly Gly Asn Ile Thr Arg
990 995 1000

gtt gag tca gaa aac aaa gtg gtg att ctg gac tcc ttc gat ccg ctt 15735
Val Glu Ser Glu Asn Lys Val Val Ile Leu Asp Ser Phe Asp Pro Leu
1005 1010 1015

gtg gcg gag gag gac gag cgg gag atc tcc gta ccc gca gaa atc ctg 15783
Val Ala Glu Glu Asp Glu Arg Glu Ile Ser Val Pro Ala Glu Ile Leu
1020 1025 1030 1035



cgg aag tct cgg aga ttc gcc cag gcc ctg ccc gtt tgg gcg cgg ccg 15831
Arg Lys Ser Arg Arg Phe Ala Gln Ala Leu Pro Val Trp Ala Arg Pro
1040 1045 1050

gac tat aac ccc ccg cta gtg gag acg tgg aaa aag ccc gac tac gaa 15879
Asp Tyr Asn Pro Pro Leu Val Glu Thr Trp Lys Lys Pro Asp Tyr Glu
1055 1060 1065

cca cct gtg gtc cat ggc tgc ccg ctt cca cct cca aag tcc cct cct 15927
Pro Pro Val Val His Gly Cys Pro Leu Pro Pro Pro Lys Ser Pro Pro
1070 1075 1080

gtg cct ccg cct cgg aag aag cgg acg gtg gtc ctc act gaa tca acc 15975
Val Pro Pro Pro Arg Lys Lys Arg Thr Val Val Leu Thr Glu Ser Thr
1085 1090 1095

cta tct act gcc ttg gcc gag ctc gcc acc aga agc ttt ggc agc tcc 16023
Leu Ser Thr Ala Leu Ala Glu Leu Ala Thr Arg Ser Phe Gly Ser Ser
1100 1105 1110 1115

tca act tcc ggc att acg ggc gac aat acg aca aca tcc tct gag ccc 16071
Ser Thr Ser Gly Ile Thr Gly Asp Asn Thr Thr Thr Ser Ser Glu Pro
1120 1125 1130

gcc cct tct ggc tgc ccc ccc gac tcc gac gct gag tcc tat tcc tcc 16119
Ala Pro Ser Gly Cys Pro Pro Asp Ser Asp Ala Glu Ser Tyr Ser Ser
1135 1140 1145

atg ccc ccc ctg gag ggg gag cct ggg gat ccg gat ctt agc gac ggg 16167
Met Pro Pro Leu Glu Gly Glu Pro Gly Asp Pro Asp Leu Ser Asp Gly
1150 1155 1160

tca tgg tca acg gtc agt agt gag gcc aac gcg gag gat gtc gtg tgc 16215
Ser Trp Ser Thr Val Ser Ser Glu Ala Asn Ala Glu Asp Val Val Cys
1165 1170 1175

tgc tca atg tct tac tct tgg aca ggc gca ctc gtc acc ccg tgc gcc 16263
Cys Ser Met Ser Tyr Ser Trp Thr Gly Ala Leu Val Thr Pro Cys Ala
1180 1185 1190 1195

gcg gaa gaa cag aaa ctg ccc atc aat gca cta agc aac tcg ttg cta 16311
Ala Glu Glu Gln Lys Leu Pro Ile Asn Ala Leu Ser Asn Ser Leu Leu
1200 1205 1210

cgt cac cac aat ttg gtg tat tcc acc acc tca cgc agt gct tgc caa 16359
Arg His His Asn Leu Val Tyr Ser Thr Thr Ser Arg Ser Ala Cys Gln
1215 1220 1225

agg cag aag aaa gtc aca ttt gac aga ctg caa gtt ctg gac agc cat 16407
Arg Gln Lys Lys Val Thr Phe Asp Arg Leu Gln Val Leu Asp Ser His
1230 1235 1240

tac cag gac gta ctc aag gag gtt aaa gca gcg gcg tca aaa gtg aag 16455
Tyr Gln Asp Val Leu Lys Glu Val Lys Ala Ala Ala Ser Lys Val Lys
1245 1250 1255

gct aac ttg cta tcc gta gag gaa gct tgc agc ctg acg ccc cca cac 16503
Ala Asn Leu Leu Ser Val Glu Glu Ala Cys Ser Leu Thr Pro Pro His
1260 1265 1270 1275

tca gcc aaa tcc aag ttt ggt tat ggg gca aaa gac gtc cgt tgc cat 16551
Ser Ala Lys Ser Lys Phe Gly Tyr Gly Ala Lys Asp Val Arg Cys His
1280 1285 1290

gcc aga aag gcc gta acc cac atc aac tcc gtg tgg aaa gac ctt ctg 16599
Ala Arg Lys Ala Val Thr His Ile Asn Ser Val Trp Lys Asp Leu Leu
1295 1300 1305

gaa gac aat gta aca cca ata gac act acc atc atg gct aag aac gag 16647
Glu Asp Asn Val Thr Pro Ile Asp Thr Thr Ile Met Ala Lys Asn Glu
1310 1315 1320

gtt ttc tgc gtt cag cct gag aag ggg ggt cgt aag cca gct cgt ctc 16695
Val Phe Cys Val Gln Pro Glu Lys Gly Gly Arg Lys Pro Ala Arg Leu
1325 1330 1335

atc gtg ttc ccc gat ctg ggc gtg cgc gtg tgc gaa aag atg gct ttg 16743
Ile Val Phe Pro Asp Leu Gly Val Arg Val Cys Glu Lys Met Ala Leu
1340 1345 1350 1355

tac gac gtg gtt aca aag ctc ccc ttg gcc gtg atg gga agc tcc tac 16791
Tyr Asp Val Val Thr Lys Leu Pro Leu Ala Val Met Gly Ser Ser Tyr
1360 1365 1370

gga ttc caa tac tca cca gga cag cgg gtt gaa ttc ctc gtg caa gcg 16839
Gly Phe Gln Tyr Ser Pro Gly Gln Arg Val Glu Phe Leu Val Gln Ala
1375 1380 1385

tgg aag tcc aag aaa acc cca atg ggg ttc tcg tat gat acc cgc tgc 16887
Trp Lys Ser Lys Lys Thr Pro Met Gly Phe Ser Tyr Asp Thr Arg Cys
1390 1395 1400

ttt gac tcc aca gtc act gag agc gac atc cgt acg gag gag gca atc 16935
Phe Asp Ser Thr Val Thr Glu Ser Asp Ile Arg Thr Glu Glu Ala Ile
1405 1410 1415

tac caa tgt tgt gac ctc gac ccc caa gcc cgc gtg gcc atc aag tcc 16983
Tyr Gln Cys Cys Asp Leu Asp Pro Gln Ala Arg Val Ala Ile Lys Ser
1420 1425 1430 1435

ctc acc gag agg ctt tat gtt ggg ggc cct ctt acc aat tca agg ggg 17031
Leu Thr Glu Arg Leu Tyr Val Gly Gly Pro Leu Thr Asn Ser Arg Gly
1440 1445 1450

gag aac tgc ggc tat cgc agg tgc cgc gcg agc ggc gta ctg aca act 17079
Glu Asn Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Leu Thr Thr
1455 1460 1465

agc tgt ggt aac acc ctc act tgc tac atc aag gcc cgg gca gcc tgt 17127
Ser Cys Gly Asn Thr Leu Thr Cys Tyr Ile Lys Ala Arg Ala Ala Cys
1470 1475 1480

cga gcc gca ggg ctc cag gac tgc acc atg ctc gtg tgt ggc gac gac 17175
Arg Ala Ala Gly Leu Gln Asp Cys Thr Met Leu Val Cys Gly Asp Asp
1485 1490 1495

tta gtc gtt atc tgt gaa agc gcg ggg gtc cag gag gac gcg gcg agc 17223
Leu Val Val Ile Cys Glu Ser Ala Gly Val Gln Glu Asp Ala Ala Ser
1500 1505 1510 1515



ctg aga gcc ttc acg gag gct atg acc agg tac tcc gcc ccc cct ggg 17271
Leu Arg Ala Phe Thr Glu Ala Met Thr Arg Tyr Ser Ala Pro Pro Gly
1520 1525 1530

gac ccc cca caa cca gaa tac gac ttg gag ctc ata aca tca tgc tcc 17319
Asp Pro Pro Gln Pro Glu Tyr Asp Leu Glu Leu Ile Thr Ser Cys Ser
1535 1540 1545

tcc aac gtg tca gtc gcc cac gac ggc gct gga aag agg gtc tac tac 17367
Ser Asn Val Ser Val Ala His Asp Gly Ala Gly Lys Arg Val Tyr Tyr
1550 1555 1560

ctc acc cgt gac cct aca acc ccc ctc gcg aga gct gcg tgg gag aca 17415
Leu Thr Arg Asp Pro Thr Thr Pro Leu Ala Arg Ala Ala Trp Glu Thr
1565 1570 1575

gca aga cac act cca gtc aat tcc tgg cta ggc aac ata atc atg ttt 17463
Ala Arg His Thr Pro Val Asn Ser Trp Leu Gly Asn Ile Ile Met Phe
1580 1585 1590 1595

gcc ccc aca ctg tgg gcg agg atg ata ctg atg acc cat ttc ttt agc 17511
Ala Pro Thr Leu Trp Ala Arg Met Ile Leu Met Thr His Phe Phe Ser
1600 1605 1610

gtc ctt ata gcc agg gac cag ctt gaa cag gcc ctc gat tgc gag atc 17559
Val Leu Ile Ala Arg Asp Gln Leu Glu Gln Ala Leu Asp Cys Glu Ile
1615 1620 1625

tac ggg gcc tgc tac tcc ata gaa cca ctg gat cta cct cca atc att 17607
Tyr Gly Ala Cys Tyr Ser Ile Glu Pro Leu Asp Leu Pro Pro Ile Ile
1630 1635 1640

caa aga ctc cat ggc ctc agc gca ttt tca ctc cac agt tac tct cca 17655
Gln Arg Leu His Gly Leu Ser Ala Phe Ser Leu His Ser Tyr Ser Pro
1645 1650 1655

ggt gaa atc aat agg gtg gcc gca tgc ctc aga aaa ctt ggg gta ccg 17703
Gly Glu Ile Asn Arg Val Ala Ala Cys Leu Arg Lys Leu Gly Val Pro
1660 1665 1670 1675

ccc ttg cga gct tgg aga cac cgg gcc cgg agc gtc cgc gct agg ctt 17751
Pro Leu Arg Ala Trp Arg His Arg Ala Arg Ser Val Arg Ala Arg Leu
1680 1685 1690

ctg gcc aga gga ggc agg gct gcc ata tgt ggc aag tac ctc ttc aac 17799
Leu Ala Arg Gly Gly Arg Ala Ala Ile Cys Gly Lys Tyr Leu Phe Asn
1695 1700 1705

tgg gca gta aga aca aag ctc aaa ctc act cca ata gcg gcc gct ggc 17847
Trp Ala Val Arg Thr Lys Leu Lys Leu Thr Pro Ile Ala Ala Ala Gly
1710 1715 1720

cag ctg gac ttg tcc ggc tgg ttc acg gct ggc tac agc ggg gga gac 17895
Gln Leu Asp Leu Ser Gly Trp Phe Thr Ala Gly Tyr Ser Gly Gly Asp
1725 1730 1735

att tat cac agc gtg tct cat gcc cgg ccc cgc tgg atc tgg ttt tgc 17943
Ile Tyr His Ser Val Ser His Ala Arg Pro Arg Trp Ile Trp Phe Cys
1740 1745 1750 1755

cta ctc ctg ctt gct gca ggg gta ggc atc tac ctc ctc ccc aac cga 17991
Leu Leu Leu Leu Ala Ala Gly Val Gly Ile Tyr Leu Leu Pro Asn Arg
1760 1765 1770

atg agc acg aat cct aaa cct caa aga aag acc aaa cgt aac acc aac 18039
Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr Asn
1775 1780 1785

cgg cgg ccg cag gac gtc aag ttc ccg ggt ggc ggt cag atc gtt ggt 18087
Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly Gln Ile Val Gly
1790 1795 1800

gga gtt tac ttg ttg ccg cgc agg ggc cct aga ttg ggt gtg cgc gcg 18135
Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Leu Gly Val Arg Ala
1805 1810 1815

acg aga aag act tcc gag cgg tcg caa cct cga ggt aga cgt cag cct 18183
Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly Arg Arg Gln Pro
1820 1825 1830 1835

atc ccc aag gct cgt cgg ccc gag ggc agg acc tgg gct cag ccc ggg 18231
Ile Pro Lys Ala Arg Arg Pro Glu Gly Arg Thr Trp Ala Gln Pro Gly
1840 1845 1850

tac cct tgg ccc ctc tat ggc aat gag ggc tgc ggg tgg gcg gga tgg 18279
Tyr Pro Trp Pro Leu Tyr Gly Asn Glu Gly Cys Gly Trp Ala Gly Trp
1855 1860 1865

ctc ctg tct ccc cgt ggc tct cgg cct agc tgg ggc ccc aca gac ccc 18327
Leu Leu Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Thr Asp Pro
1870 1875 1880

cgg cgt agg tcg cgc aat ttg ggt aag taatagtcga ctttgttccc 18374
Arg Arg Arg Ser Arg Asn Leu Gly Lys
1885 1890

actgtacttt tagctcgtac aaaatacaat atacttttca tttctccgta aacaacatgt 18434 

tttcccatgt aatatccttt tctatttttc gttccgttac caactttaca catactttat 18494 

atagctattc acttctatac actaaaaaac taagacaatt ttaattttgc tgcctgccat 18554 

atttcaattt gttataaatt cctataattt atcctattag tagctaaaaa aagatgaatg 18614 

tgaatcgaat cctaagagaa ttggatctga tccacaggac gggtgtggtc gccatgatcg 18674 

cgtagtcgat agtggctcca agtagcgaag cgagcaggac tgggcggcgg ccaaagcggt 18734 

cggacagtgc tccgagaacg ggtgcgcata gaaattgcat caacgcatat agcgctagca 18794 

gcacgccata gtgactggcg atgctgtcgg aatggacgat atcccgcaag aggcccggca 18854 

gtaccggcat aaccaagcct atgcctacag catccagggt gacggtgccg aggatgacga 18914 

tgagcgcatt gttagatttc atacacggtg cctgactgcg ttagcaattt aactgtgata 18974 

aactaccgca ttaaagcttt ttctttccaa tttttttttt ttcgtcatta taaaaatcat 19034 

tacgaccgag attcccgggt aataactgat ataattaaat tgaagctcta atttgtgagt 19094 

ttagtataca tgcatttact tataatacag ttttttagtt ttgctggccg catcttctca 19154 

aatatgcttc ccagcctgct tttctgtaac gttcaccctc taccttagca tcccttccct 19214 

ttgcaaatag tcctcttcca acaataataa tgtcagatcc tgtagagacc acatcatcca 19274 

cggttctata ctgttgaccc aatgcgtctc ccttgtcatc taaacccaca ccgggtgtca 19334 

taatcaacca atcgtaacct tcatctcttc cacccatgtc tctttgagca ataaagccga 19394 

taacaaaatc tttgtcgctc ttcgcaatgt caacagtacc cttagtatat tctccagtag 19454 

atagggagcc cttgcatgac aattctgcta acatcaaaag gcctctaggt tcctttgtta 19514 

cttcttctgc cgcctgcttc aaaccgctaa caatacctgg gcccaccaca ccgtgtgcat 19574 

tcgtaatgtc tgcccattct gctattctgt atacacccgc agagtactgc aatttgactg 19634 

tattaccaat gtcagcaaat tttctgtctt cgaagagtaa aaaattgtac ttggcggata 19694 

atgcctttag cggcttaact gtgccctcca tggaaaaatc agtcaagata tccacatgtg 19754 

tttttagtaa acaaattttg ggacctaatg cttcaactaa ctccagtaat tccttggtgg 19814 

tacgaacatc caatgaagca cacaagtttg tttgcttttc gtgcatgata ttaaatagct 19874 

tggcagcaac aggactagga tgagtagcag cacgttcctt atatgtagct ttcgacatga 19934 

tttatcttcg tttcctgcag gtttttgttc tgtgcagttg ggttaagaat actgggcaat 19994 

ttcatgtttc ttcaacacta catatgcgta tatataccaa tctaagtctg tgctccttcc 20054 

ttcgttcttc cttctgttcg gagattaccg aatcaaaaaa atttcaagga aaccgaaatc 20114 

aaaaaaaaga ataaaaaaaa aatgatgaat tgaaaagctt atcgat 20160 



<210> 13 
<211> 1892 
<212> PRT 

<213> Artificial Sequence 

<220> 

<223> Description of Artificial Sequence: 
pd.delta.NS3NS5.pj.core121 

<400> 13 

Met Ala Ala Tyr Ala Ala Gln Gly Tyr Lys Val Leu Val Leu Asn Pro 
1 5 10 15 

Ser Val Ala Ala Thr Leu Gly Phe Gly Ala Tyr Met Ser Lys Ala His 
20 25 30 

Gly Ile Asp Pro Asn Ile Arg Thr Gly Val Arg Thr Ile Thr Thr Gly 
35 40 45 

Ser Pro Ile Thr Tyr Ser Thr Tyr Gly Lys Phe Leu Ala Asp Gly Gly 
50 55 60 

Cys Ser Gly Gly Ala Tyr Asp Ile Ile Ile Cys Asp Glu Cys His Ser 
65 70 75 80 

Thr Asp Ala Thr Ser Ile Leu Gly Ile Gly Thr Val Leu Asp Gln Ala 
85 90 95 

Glu Thr Ala Gly Ala Arg Leu Val Val Leu Ala Thr Ala Thr Pro Pro 
100 105 110 

Gly Ser Val Thr Val Pro His Pro Asn Ile Glu Glu Val Ala Leu Ser 
115 120 125 

Thr Thr Gly Glu Ile Pro Phe Tyr Gly Lys Ala Ile Pro Leu Glu Val 
130 135 140 

Ile Lys Gly Gly Arg His Leu Ile Phe Cys His Ser Lys Lys Lys Cys 
145 150 155 160 

Asp Glu Leu Ala Ala Lys Leu Val Ala Leu Gly Ile Asn Ala Val Ala 
165 170 175 

Tyr Tyr Arg Gly Leu Asp Val Ser Val Ile Pro Thr Ser Gly Asp Val 
180 185 190 

Val Val Val Ala Thr Asp Ala Leu Met Thr Gly Tyr Thr Gly Asp Phe 
195 200 205 

Asp Ser Val Ile Asp Cys Asn Thr Cys Val Thr Gln Thr Val Asp Phe 
210 215 220 

Ser Leu Asp Pro Thr Phe Thr Ile Glu Thr Ile Thr Leu Pro Gln Asp 
225 230 235 240 

Ala Val Ser Arg Thr Gln Arg Arg Gly Arg Thr Gly Arg Gly Lys Pro 
245 250 255 

Gly Ile Tyr Arg Phe Val Ala Pro Gly Glu Arg Pro Ser Gly Met Phe 
260 265 270 

Asp Ser Ser Val Leu Cys Glu Cys Tyr Asp Ala Gly Cys Ala Trp Tyr 
275 280 285 

Glu Leu Thr Pro Ala Glu Thr Thr Val Arg Leu Arg Ala Tyr Met Asn 
290 295 300 

Thr Pro Gly Leu Pro Val Cys Gln Asp His Leu Glu Phe Trp Glu Gly 
305 310 315 320 

Val Phe Thr Gly Leu Thr His Ile Asp Ala His Phe Leu Ser Gln Thr 
325 330 335 

Lys Gln Ser Gly Glu Asn Leu Pro Tyr Leu Val Ala Tyr Gln Ala Thr 
340 345 350 

Val Cys Ala Arg Ala Gln Ala Pro Pro Pro Ser Trp Asp Gln Met Trp 
355 360 365 

Lys Cys Leu Ile Arg Leu Lys Pro Thr Leu His Gly Pro Thr Pro Leu 
370 375 380 



Leu Tyr Arg Leu Gly Ala Val Gln Asn Glu Ile Thr Leu Thr His Pro 
385 390 395 400 

Val Thr Lys Tyr Ile Met Thr Cys Met Ser Ala Asp Leu Glu Val Val 
405 410 415 

Thr Ser Thr Trp Val Leu Val Gly Gly Val Leu Ala Ala Leu Ala Ala 
420 425 430 

Tyr Cys Leu Ser Thr Gly Cys Val Val Ile Val Gly Arg Val Val Leu 
435 440 445 

Ser Gly Lys Pro Ala Ile Ile Pro Asp Arg Glu Val Leu Tyr Arg Glu 
450 455 460 

Phe Asp Glu Met Glu Glu Cys Ser Gln His Leu Pro Tyr Ile Glu Gln 
465 470 475 480 

Gly Met Met Leu Ala Glu Gln Phe Lys Gln Lys Ala Leu Gly Leu Leu 
485 490 495 

Gln Thr Ala Ser Arg Gln Ala Glu Val Ile Ala Pro Ala Val Gln Thr 
500 505 510 

Asn Trp Gln Lys Leu Glu Thr Phe Trp Ala Lys His Met Trp Asn Phe 
515 520 525 

Ile Ser Gly Ile Gln Tyr Leu Ala Gly Leu Ser Thr Leu Pro Gly Asn 
530 535 540 

Pro Ala Ile Ala Ser Leu Met Ala Phe Thr Ala Ala Val Thr Ser Pro 
545 550 555 560 

Leu Thr Thr Ser Gln Thr Leu Leu Phe Asn Ile Leu Gly Gly Trp Val 
565 570 575 

Ala Ala Gln Leu Ala Ala Pro Gly Ala Ala Thr Ala Phe Val Gly Ala 
580 585 590 

Gly Leu Ala Gly Ala Ala Ile Gly Ser Val Gly Leu Gly Lys Val Leu 
595 600 605 

Ile Asp Ile Leu Ala Gly Tyr Gly Ala Gly Val Ala Gly Ala Leu Val 
610 615 620 

Ala Phe Lys Ile Met Ser Gly Glu Val Pro Ser Thr Glu Asp Leu Val 
625 630 635 640 

Asn Leu Leu Pro Ala Ile Leu Ser Pro Gly Ala Leu Val Val Gly Val 
645 650 655 

Val Cys Ala Ala Ile Leu Arg Arg His Val Gly Pro Gly Glu Gly Ala 
660 665 670 

Val Gln Trp Met Asn Arg Leu Ile Ala Phe Ala Ser Arg Gly Asn His 
675 680 685 

Val Ser Pro Thr His Tyr Val Pro Glu Ser Asp Ala Ala Ala Arg Val 
690 695 700 



Thr Ala Ile Leu Ser Ser Leu Thr Val Thr Gln Leu Leu Arg Arg Leu 
705 710 715 720 

His Gln Trp Ile Ser Ser Glu Cys Thr Thr Pro Cys Ser Gly Ser Trp 
725 730 735 

Leu Arg Asp Ile Trp Asp Trp Ile Cys Glu Val Leu Ser Asp Phe Lys 
740 745 750 

Thr Trp Leu Lys Ala Lys Leu Met Pro Gln Leu Pro Gly Ile Pro Phe 
755 760 765 

Val Ser Cys Gln Arg Gly Tyr Lys Gly Val Trp Arg Gly Asp Gly Ile 
770 775 780 

Met His Thr Arg Cys His Cys Gly Ala Glu Ile Thr Gly His Val Lys 
785 790 795 800 

Asn Gly Thr Met Arg Ile Val Gly Pro Arg Thr Cys Arg Asn Met Trp 
805 810 815 

Ser Gly Thr Phe Pro Ile Asn Ala Tyr Thr Thr Gly Pro Cys Thr Pro 
820 825 830 

Leu Pro Ala Pro Asn Tyr Thr Phe Ala Leu Trp Arg Val Ser Ala Glu 
835 840 845 

Glu Tyr Val Glu Ile Arg Gln Val Gly Asp Phe His Tyr Val Thr Gly 
850 855 860 

Met Thr Thr Asp Asn Leu Lys Cys Pro Cys Gln Val Pro Ser Pro Glu 
865 870 875 880 

Phe Phe Thr Glu Leu Asp Gly Val Arg Leu His Arg Phe Ala Pro Pro 
885 890 895 

Cys Lys Pro Leu Leu Arg Glu Glu Val Ser Phe Arg Val Gly Leu His 
900 905 910 

Glu Tyr Pro Val Gly Ser Gln Leu Pro Cys Glu Pro Glu Pro Asp Val 
915 920 925 

Ala Val Leu Thr Ser Met Leu Thr Asp Pro Ser His Ile Thr Ala Glu 
930 935 940 

Ala Ala Gly Arg Arg Leu Ala Arg Gly Ser Pro Pro Ser Val Ala Ser 
945 950 955 960 

Ser Ser Ala Ser Gln Leu Ser Ala Pro Ser Leu Lys Ala Thr Cys Thr 
965 970 975 

Ala Asn His Asp Ser Pro Asp Ala Glu Leu Ile Glu Ala Asn Leu Leu 
980 985 990 

Trp Arg Gln Glu Met Gly Gly Asn Ile Thr Arg Val Glu Ser Glu Asn 
995 1000 1005 

Lys Val Val Ile Leu Asp Ser Phe Asp Pro Leu Val Ala Glu Glu Asp 
1010 1015 1020 

Glu Arg Glu Ile Ser Val Pro Ala Glu Ile Leu Arg Lys Ser Arg Arg 
1025 1030 1035 1040 

Phe Ala Gln Ala Leu Pro Val Trp Ala Arg Pro Asp Tyr Asn Pro Pro 
1045 1050 1055 

Leu Val Glu Thr Trp Lys Lys Pro Asp Tyr Glu Pro Pro Val Val His 
1060 1065 1070 

Gly Cys Pro Leu Pro Pro Pro Lys Ser Pro Pro Val Pro Pro Pro Arg 
1075 1080 1085 

Lys Lys Arg Thr Val Val Leu Thr Glu Ser Thr Leu Ser Thr Ala Leu 
1090 1095 1100 

Ala Glu Leu Ala Thr Arg Ser Phe Gly Ser Ser Ser Thr Ser Gly Ile 
1105 1110 1115 1120 

Thr Gly Asp Asn Thr Thr Thr Ser Ser Glu Pro Ala Pro Ser Gly Cys 
1125 1130 1135 

Pro Pro Asp Ser Asp Ala Glu Ser Tyr Ser Ser Met Pro Pro Leu Glu 
1140 1145 1150 

Gly Glu Pro Gly Asp Pro Asp Leu Ser Asp Gly Ser Trp Ser Thr Val 
1155 1160 1165 

Ser Ser Glu Ala Asn Ala Glu Asp Val Val Cys Cys Ser Met Ser Tyr 
1170 1175 1180 

Ser Trp Thr Gly Ala Leu Val Thr Pro Cys Ala Ala Glu Glu Gln Lys 
1185 1190 1195 1200 

Leu Pro Ile Asn Ala Leu Ser Asn Ser Leu Leu Arg His His Asn Leu 
1205 1210 1215 

Val Tyr Ser Thr Thr Ser Arg Ser Ala Cys Gln Arg Gln Lys Lys Val 
1220 1225 1230 

Thr Phe Asp Arg Leu Gln Val Leu Asp Ser His Tyr Gln Asp Val Leu 
1235 1240 1245 

Lys Glu Val Lys Ala Ala Ala Ser Lys Val Lys Ala Asn Leu Leu Ser 
1250 1255 1260 

Val Glu Glu Ala Cys Ser Leu Thr Pro Pro His Ser Ala Lys Ser Lys 
1265 1270 1275 1280 

Phe Gly Tyr Gly Ala Lys Asp Val Arg Cys His Ala Arg Lys Ala Val 
1285 1290 1295 

Thr His Ile Asn Ser Val Trp Lys Asp Leu Leu Glu Asp Asn Val Thr 
1300 1305 1310 

Pro Ile Asp Thr Thr Ile Met Ala Lys Asn Glu Val Phe Cys Val Gln 
1315 1320 1325 

Pro Glu Lys Gly Gly Arg Lys Pro Ala Arg Leu Ile Val Phe Pro Asp 
1330 1335 1340 

Leu Gly Val Arg Val Cys Glu Lys Met Ala Leu Tyr Asp Val Val Thr 
1345 1350 1355 1360 

Lys Leu Pro Leu Ala Val Met Gly Ser Ser Tyr Gly Phe Gln Tyr Ser 
1365 1370 1375 

Pro Gly Gln Arg Val Glu Phe Leu Val Gln Ala Trp Lys Ser Lys Lys 
1380 1385 1390 

Thr Pro Met Gly Phe Ser Tyr Asp Thr Arg Cys Phe Asp Ser Thr Val 
1395 1400 1405 

Thr Glu Ser Asp Ile Arg Thr Glu Glu Ala Ile Tyr Gln Cys Cys Asp 
1410 1415 1420 

Leu Asp Pro Gln Ala Arg Val Ala Ile Lys Ser Leu Thr Glu Arg Leu 
1425 1430 1435 1440 

Tyr Val Gly Gly Pro Leu Thr Asn Ser Arg Gly Glu Asn Cys Gly Tyr 
1445 1450 1455 

Arg Arg Cys Arg Ala Ser Gly Val Leu Thr Thr Ser Cys Gly Asn Thr 
1460 1465 1470 

Leu Thr Cys Tyr Ile Lys Ala Arg Ala Ala Cys Arg Ala Ala Gly Leu 
1475 1480 1485 

Gln Asp Cys Thr Met Leu Val Cys Gly Asp Asp Leu Val Val Ile Cys 
1490 1495 1500 

Glu Ser Ala Gly Val Gln Glu Asp Ala Ala Ser Leu Arg Ala Phe Thr 
1505 1510 1515 1520 

Glu Ala Met Thr Arg Tyr Ser Ala Pro Pro Gly Asp Pro Pro Gln Pro 
1525 1530 1535 

Glu Tyr Asp Leu Glu Leu Ile Thr Ser Cys Ser Ser Asn Val Ser Val 
1540 1545 1550 

Ala His Asp Gly Ala Gly Lys Arg Val Tyr Tyr Leu Thr Arg Asp Pro 
1555 1560 1565 

Thr Thr Pro Leu Ala Arg Ala Ala Trp Glu Thr Ala Arg His Thr Pro 
1570 1575 1580 

Val Asn Ser Trp Leu Gly Asn Ile Ile Met Phe Ala Pro Thr Leu Trp 
1585 1590 1595 1600 

Ala Arg Met Ile Leu Met Thr His Phe Phe Ser Val Leu Ile Ala Arg 
1605 1610 1615 

Asp Gln Leu Glu Gln Ala Leu Asp Cys Glu Ile Tyr Gly Ala Cys Tyr 
1620 1625 1630 

Ser Ile Glu Pro Leu Asp Leu Pro Pro Ile Ile Gln Arg Leu His Gly 
1635 1640 1645 

Leu Ser Ala Phe Ser Leu His Ser Tyr Ser Pro Gly Glu Ile Asn Arg 
1650 1655 1660 

Val Ala Ala Cys Leu Arg Lys Leu Gly Val Pro Pro Leu Arg Ala Trp 
1665 1670 1675 1680 






Arg His Arg Ala Arg Ser Val Arg Ala Arg Leu Leu Ala Arg Gly Gly 
1685 1690 1695 

Arg Ala Ala Ile Cys Gly Lys Tyr Leu Phe Asn Trp Ala Val Arg Thr 
1700 1705 1710 

Lys Leu Lys Leu Thr Pro Ile Ala Ala Ala Gly Gln Leu Asp Leu Ser 
1715 1720 1725 

Gly Trp Phe Thr Ala Gly Tyr Ser Gly Gly Asp Ile Tyr His Ser Val 
1730 1735 1740 

Ser His Ala Arg Pro Arg Trp Ile Trp Phe Cys Leu Leu Leu Leu Ala 
1745 1750 1755 1760 

Ala Gly Val Gly Ile Tyr Leu Leu Pro Asn Arg Met Ser Thr Asn Pro 
1765 1770 1775 

Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr Asn Arg Arg Pro Gln Asp 
1780 1785 1790 

Val Lys Phe Pro Gly Gly Gly Gln Ile Val Gly Gly Val Tyr Leu Leu 
1795 1800 1805 

Pro Arg Arg Gly Pro Arg Leu Gly Val Arg Ala Thr Arg Lys Thr Ser 
1810 1815 1820 

Glu Arg Ser Gln Pro Arg Gly Arg Arg Gln Pro Ile Pro Lys Ala Arg 
1825 1830 1835 1840 

Arg Pro Glu Gly Arg Thr Trp Ala Gln Pro Gly Tyr Pro Trp Pro Leu 
1845 1850 1855 

Tyr Gly Asn Glu Gly Cys Gly Trp Ala Gly Trp Leu Leu Ser Pro Arg 
1860 1865 1870 

Gly Ser Arg Pro Ser Trp Gly Pro Thr Asp Pro Arg Arg Arg Ser Arg 
1875 1880 1885 

Asn Leu Gly Lys 
1890 



<210> 14 
<211> 20316 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 
pd.delta.NS3NS5.pj.core173 

<220> 
<221> CDS 

<222> (12679)..(18510) 
<400> 14 

atcgatccta ccccttgcgc taaagaagta tatgtgccta ctaacgcttg tctttgtctc 60 

tgtcactaaa cactggatta ttactcccag atacttattt tggactaatt taaatgattt 120 

cggatcaacg ttcttaatat cgctgaatct tccacaattg atgaaagtag ctaggaagag 180 

gaattggtat aaagtttttg tttttgtaaa tctcgaagta tactcaaacg aatttagtat 240 

tttctcagtg atctcccaga tgctttcacc ctcacttaga agtgctttaa gcattttttt 300 

actgtggcta tttcccttat ctgcttcttc cgatgattcg aactgtaatt gcaaactact 360 

tacaatatca gtgatatcag attgatgttt ttgtccatag taaggaataa ttgtaaattc 420 

ccaagcagga atcaatttct ttaatgaggc ttccagaatt gttgcttttt gcgtcttgta 480 

tttaaactgg agtgatttat tgacaatatc gaaactcagc gaattgctta tgatagtatt 540 

atagctcatg aatgtggctc tcttgattgc tgttccgtta tgtgtaatca tccaacataa 600 

ataggttagt tcagcagcac ataatgctat tttctcacct gaaggtcttt caaacctttc 660 

cacaaactga cgaacaagca ccttaggtgg tgttttacat aatatatcaa attgtggcat 720 

gcttagcgcc gatcttgtgt gcaattgata tctagtttca actactctat ttatcttgta 780 

tcttgcagta ttcaaacacg ctaactcgaa aaactaactt taattgtcct gtttgtctcg 840 

cgttctttcg aaaaatgcac cggccgcgca ttatttgtac tgcgaaaata attggtactg 900 

cggtatcttc atttcatatt ttaaaaatgc acctttgctg cttttcctta atttttagac 960 

ggcccgcagg ttcgttttgc ggtactatct tgtgataaaa agttgttttg acatgtgatc 1020 

tgcacagatt ttataatgta ataagcaaga atacattatc aaacgaacaa tactggtaaa 1080 

agaaaaccaa aatggacgac attgaaacag ccaagaatct gacggtaaaa gcacgtacag 1140 

cttatagcgt ctgggatgta tgtcggctgt ttattgaaat gattgctcct gatgtagata 1200 

ttgatataga gagtaaacgt aagtctgatg agctactctt tccaggatat gtcataaggc 1260 

ccatggaatc tctcacaacc ggtaggccgt atggtcttga ttctagcgca gaagattcca 1320 

gcgtatcttc tgactccagt gctgaggtaa ttttgcctgc tgcgaagatg gttaaggaaa 1380 

ggtttgattc gattggaaat ggtatgctct cttcacaaga agcaagtcag gctgccatag 1440 

atttgatgct acagaataac aagctgttag acaatagaaa gcaactatac aaatctattg 1500 

ctataataat aggaagattg cccgagaaag acaagaagag agctaccgaa atgctcatga 1560 

gaaaaatgga ttgtacacag ttattagtcc caccagctcc aacggaagaa gatgttatga 1620 

agctcgtaag cgtcgttacc caattgctta ctttagttcc accagatcgt caagctgctt 1680 

taataggtga tttattcatc ccggaatctc taaaggatat attcaatagt ttcaatgaac 1740 

tggcggcaga gaatcgttta cagcaaaaaa agagtgagtt ggaaggaagg actgaagtga 1800 


accatgctaa tacaaatgaa gaagttccct ccaggcgaac aagaagtaga gacacaaatg 1860 

caagaggagc atataaatta caaaacacca tcactgaggg ccctaaagcg gttcccacga 1920 

aaaaaaggag agtagcaacg agggtaaggg gcagaaaatc acgtaatact tctagggtat 1980 

gatccaatat caaaggaaat gatagcattg aaggatgaga ctaatccaat tgaggagtgg 2040 

cagcatatag aacagctaaa gggtagtgct gaaggaagca tacgataccc cgcatggaat 2100 

gggataatat cacaggaggt actagactac ctttcatcct acataaatag acgcatataa 2160 

gtacgcattt aagcataaac acgcactatg ccgttcttct catgtatata tatatacagg 2220 

caacacgcag atataggtgc gacgtgaaca gtgagctgta tgtgcgcagc tcgcgttgca 2280 

ttttcggaag cgctcgtttt cggaaacgct ttgaagttcc tattccgaag ttcctattct 2340 

ctagaaagta taggaacttc agagcgcttt tgaaaaccaa aagcgctctg aagacgcact 2400 

ttcaaaaaac caaaaacgca ccggactgta acgagctact aaaatattgc gaataccgct 2460 

tccacaaaca ttgctcaaaa gtatctcttt gctatatatc tctgtgctat atccctatat 2520 

aacctaccca tccacctttc gctccttgaa cttgcatcta aactcgacct ctacatcaac 2580 

aggcttccaa tgctcttcaa attttactgt caagtagacc catacggctg taatatgctg 2640 

ctcttcataa tgtaagctta tctttatcga atcgtgtgaa aaactactac cgcgataaac 2700 

ctttacggtt ccctgagatt gaattagttc ctttagtata tgatacaaga cacttttgaa 2760 

ctttgtacga cgaattttga ggttcgccat cctctggcta tttccaatta tcctgtcggc 2820 

tattatctcc gcctcagttt gatcttccgc ttcagactgc catttttcac ataatgaatc 2880 

tatttcaccc cacaatcctt catccgcctc cgcatcttgt tccgttaaac tattgacttc 2940 

atgttgtaca ttgtttagtt cacgagaagg gtcctcttca ggcggtagct cctgatctcc 3000 

tatatgacct ttatcctgtt ctctttccac aaacttagaa atgtattcat gaattatgga 3060 

gcacctaata acattcttca aggcggagaa gtttgggcca gatgcccaat atgcttgaca 3120 

tgaaaacgtg agaatgaatt tagtattatt gtgatattct gaggcaattt tattataatc 3180 

tcgaagataa gagaagaatg cagtgacctt tgtattgaca aatggagatt ccatgtatct 3240 

aaaaaatacg cctttaggcc ttctgatacc ctttcccctg cggtttagcg tgccttttac 3300 

attaatatct aaaccctctc cgatggtggc ctttaactga ctaataaatg caaccgatat 3360 

aaactgtgat aattctgggt gatttatgat tcgatcgaca attgtattgt acactagtgc 3420 

aggatcaggc caatccagtt ctttttcaat taccggtgtg tcgtctgtat tcagtacatg 3480 

tccaacaaat gcaaatgcta acgttttgta tttcttataa ttgtcaggaa ctggaaaagt 3540 

cccccttgtc gtctcgatta cacacctact ttcatcgtac accataggtt ggaagtgctg 3600 


cataatacat tgcttaatac aagcaagcag tctctcgcca ttcatatttc agttattttc 3660 

cattacagct gatgtcattg tatatcagcg ctgtaaaaat ctatctgtta cagaaggttt 3720 

tcgcggtttt tataaacaaa actttcgtta cgaaatcgag caatcacccc agctgcgtat 3780 

ttggaaattc gggaaaaagt agagcaacgc gagttgcatt ttttacacca taatgcatga 3840 

ttaacttcga gaagggatta aggctaattt cactagtatg tttcaaaaac ctcaatctgt 3900 

ccattgaatg ccttataaaa cagctataga ttgcatagaa gagttagcta ctcaatgctt 3960 

tttgtcaaag cttactgatg atgatgtgtc tactttcagg cgggtctgta gtaaggagaa 4020 

tgacattata aagctggcac ttagaattcc acggactata gactatacta gtatactccg 4080 

tctactgtac gatacacttc cgctcaggtc cttgtccttt aacgaggcct taccactctt 4140 

ttgttactct attgatccag ctcagcaaag gcagtgtgat ctaagattct atcttcgcga 4200 

tgtagtaaaa ctagctagac cgagaaagag actagaaatg caaaaggcac ttctacaatg 4260 

gctgccatca ttattatccg atgtgacgct gcattttttt tttttttttt tttttttttt 4320 

tttttttttt tttttttttt ttttttggta caaatatcat aaaaaaagag aatcttttta 4380 

agcaaggatt ttcttaactt cttcggcgac agcatcaccg acttcggtgg tactgttgga 4440 

accacctaaa tcaccagttc tgatacctgc atccaaaacc tttttaactg catcttcaat 4500 

ggctttacct tcttcaggca agttcaatga caatttcaac atcattgcag cagacaagat 4560 

agtggcgata gggttgacct tattctttgg caaatctgga gcggaaccat ggcatggttc 4620 

gtacaaacca aatgcggtgt tcttgtctgg caaagaggcc aaggacgcag atggcaacaa 4680 

acccaaggag cctgggataa cggaggcttc atcggagatg atatcaccaa acatgttgct 4740 

ggtgattata ataccattta ggtgggttgg gttcttaact aggatcatgg cggcagaatc 4800 

aatcaattga tgttgaactt tcaatgtagg gaattcgttc ttgatggttt cctccacagt 4860 

ttttctccat aatcttgaag aggccaaaac attagcttta tccaaggacc aaataggcaa 4920 

tggtggctca tgttgtaggg ccatgaaagc ggccattctt gtgattcttt gcacttctgg 4980 

aacggtgtat tgttcactat cccaagcgac accatcacca tcgtcttcct ttctcttacc 5040 

aaagtaaata cctcccacta attctctaac aacaacgaag tcagtacctt tagcaaattg 5100 

tggcttgatt ggagataagt ctaaaagaga gtcggatgca aagttacatg gtcttaagtt 5160 

ggcgtacaat tgaagttctt tacggatttt tagtaaacct tgttcaggtc taacactacc 5220 

ggtaccccat ttaggaccac ccacagcacc taacaaaacg gcatcagcct tcttggaggc 5280 

ttccagcgcc tcatctggaa gtggaacacc tgtagcatcg atagcagcac caccaattaa 5340 

atgattttcg aaatcgaact tgacattgga acgaacatca gaaatagctt taagaacctt 5400 


aatggcttcg gctgtgattt cttgaccaac gtggtcacct ggcaaaacga cgatcttctt 5460 

aggggcagac attacaatgg tatatccttg aaatatatat aaaaaaaaaa aaaaaaaaaa 5520 

aaaaaaaaaa atgcagcttc tcaatgatat tcgaatacgc tttgaggaga tacagcctaa 5580 

tatccgacaa actgttttac agatttacga tcgtacttgt tacccatcat tgaattttga 5640 

acatccgaac ctgggagttt tccctgaaac agatagtata tttgaacctg tataataata 5700 

tatagtctag cgctttacgg aagacaatgt atgtatttcg gttcctggag aaactattgc 5760 

atctattgca taggtaatct tgcacgtcgc atccccggtt cattttctgc gtttccatct 5820 

tgcacttcaa tagcatatct ttgttaacga agcatctgtg cttcattttg tagaacaaaa 5880 

atgcaacgcg agagcgctaa tttttcaaac aaagaatctg agctgcattt ttacagaaca 5940 

gaaatgcaac gcgaaagcgc tattttacca acgaagaatc tgtgcttcat ttttgtaaaa 6000 

caaaaatgca acgcgagagc gctaattttt caaacaaaga atctgagctg catttttaca 6060 

gaacagaaat gcaacgcgag agcgctattt taccaacaaa gaatctatac ttcttttttg 6120 

ttctacaaaa atgcatcccg agagcgctat ttttctaaca aagcatctta gattactttt 6180 

tttctccttt gtgcgctcta taatgcagtc tcttgataac tttttgcact gtaggtccgt 6240 

taaggttaga agaaggctac tttggtgtct attttctctt ccataaaaaa agcctgactc 6300 

cacttcccgc gtttactgat tactagcgaa gctgcgggtg cattttttca agataaaggc 6360 

atccccgatt atattctata ccgatgtgga ttgcgcatac tttgtgaaca gaaagtgata 6420 

gcgttgatga ttcttcattg gtcagaaaat tatgaacggt ttcttctatt ttgtctctat 6480 

atactacgta taggaaatgt ttacattttc gtattgtttt cgattcactc tatgaatagt 6540 

tcttactaca atttttttgt ctaaagagta atactagaga taaacataaa aaatgtagag 6600 

gtcgagttta gatgcaagtt caaggagcga aaggtggatg ggtaggttat atagggatat 6660 

agcacagaga tatatagcaa agagatactt ttgagcaatg tttgtggaag cggtattcgc 6720 

aatattttag tagctcgtta cagtccggtg cgtttttggt tttttgaaag tgcgtcttca 6780 

gagcgctttt ggttttcaaa agcgctctga agttcctata ctttctagag aataggaact 6840 

tcggaatagg aacttcaaag cgtttccgaa aacgagcgct tccgaaaatg caacgcgagc 6900 

tgcgcacata cagctcactg ttcacgtcgc acctatatct gcgtgttgcc tgtatatata 6960 

tatacatgag aagaacggca tagtgcgtgt ttatgcttaa atgcgtactt atatgcgtct 7020 

atttatgtag gatgaaaggt agtctagtac ctcctgtgat attatcccat tccatgcggg 7080 

gtatcgtatg cttccttcag cactaccctt tagctgttct atatgctgcc actcctcaat 7140 

tggattagtc tcatccttca atgctatcat ttcctttgat attggatcat atgcatagta 7200 


ccgagaaact agtgcgaagt agtgatcagg tattgctgtt atctgatgag tatacgttgt 7260 

cctggccacg gcagaagcac gcttatcgct ccaatttccc acaacattag tcaactccgt 7320 

taggcccttc attgaaagaa atgaggtcat caaatgtctt ccaatgtgag attttgggcc 7380 

attttttata gcaaagattg aataaggcgc atttttcttc aaagctttat tgtacgatct 7440 

gactaagtta tcttttaata attggtattc ctgtttattg cttgaagaat tgccggtcct 7500 

atttactcgt tttaggactg gttcagaatt cctcaaaaat tcatccaaat atacaagtgg 7560 

atcgatgata agctgtcaaa catgagaatt cttgaagacg aaagggcctc gtgatacgcc 7620 

tatttttata ggttaatgtc atgataataa tggtttctta gacgtcaggt ggcacttttc 7680 

ggggaaatgt gcgcggaacc cctatttgtt tatttttcta aatacattca aatatgtatc 7740 

cgctcatgag acaataaccc tgataaatgc ttcaataata ttgaaaaagg aagagtatga 7800 

gtattcaaca tttccgtgtc gcccttattc ccttttttgc ggcattttgc cttcctgttt 7860 

ttgctcaccc agaaacgctg gtgaaagtaa aagatgctga agatcagttg ggtgcacgag 7920 

tgggttacat cgaactggat ctcaacagcg gtaagatcct tgagagtttt cgccccgaag 7980 

aacgttttcc aatgatgagc acttttaaag ttctgctatg tggcgcggta ttatcccgtg 8040 

ttgacgccgg gcaagagcaa ctcggtcgcc gcatacacta ttctcagaat gacttggttg 8100 

agtactcacc agtcacagaa aagcatctta cggatggcat gacagtaaga gaattatgca 8160 

gtgctgccat aaccatgagt gataacactg cggccaactt acttctgaca acgatcggag 8220 

gaccgaagga gctaaccgct tttttgcaca acatggggga tcatgtaact cgccttgatc 8280 

gttgggaacc ggagctgaat gaagccatac caaacgacga gcgtgacacc acgatgcctg 8340 

cagcaatggc aacaacgttg cgcaaactat taactggcga actacttact ctagcttccc 8400 

ggcaacaatt aatagactgg atggaggcgg ataaagttgc aggaccactt ctgcgctcgg 8460 

cccttccggc tggctggttt attgctgata aatctggagc cggtgagcgt gggtctcgcg 8520 

gtatcattgc agcactgggg ccagatggta agccctcccg tatcgtagtt atctacacga 8580 

cggggagtca ggcaactatg gatgaacgaa atagacagat cgctgagata ggtgcctcac 8640 

tgattaagca ttggtaactg tcagaccaag tttactcata tatactttag attgatttaa 8700 

aacttcattt ttaatttaaa aggatctagg tgaagatcct ttttgataat ctcatgacca 8760 

aaatccctta acgtgagttt tcgttccact gagcgtcaga ccccgtagaa aagatcaaag 8820 

gatcttcttg agatcctttt tttctgcgcg taatctgctg cttgcaaaca aaaaaaccac 8880 

cgctaccagc ggtggtttgt ttgccggatc aagagctacc aactcttttt ccgaaggtaa 8940 

ctggcttcag cagagcgcag ataccaaata ctgtccttct agtgtagccg tagttaggcc 9000 


accacttcaa gaactctgta gcaccgccta catacctcgc tctgctaatc ctgttaccag 9060 

tggctgctgc cagtggcgat aagtcgtgtc ttaccgggtt ggactcaaga cgatagttac 9120 

cggataaggc gcagcggtcg ggctgaacgg ggggttcgtg cacacagccc agcttggagc 9180 

gaacgaccta caccgaactg agatacctac agcgtgagct atgagaaagc gccacgcttc 9240 

ccgaagggag aaaggcggac aggtatccgg taagcggcag ggtcggaaca ggagagcgca 9300 

cgagggagct tccaggggga aacgcctggt atctttatag tcctgtcggg tttcgccacc 9360 

tctgacttga gcgtcgattt ttgtgatgct cgtcaggggg gcggagccta tggaaaaacg 9420 

ccagcaacgc ggccttttta cggttcctgg ccttttgctg gccttttgct cacatgttct 9480 

ttcctgcgtt atcccctgat tctgtggata accgtattac cgcctttgag tgagctgata 9540 

ccgctcgccg cagccgaacg accgagcgca gcgagtcagt gagcgaggaa gcggaagagc 9600 

gcctgatgcg gtattttctc cttacgcatc tgtgcggtat ttcacaccgc atatggtgca 9660 

ctctcagtac aatctgctct gatgccgcat agttaagcca gtatacactc cgctatcgct 9720 

acgtgactgg gtcatggctg cgccccgaca cccgccaaca cccgctgacg cgccctgacg 9780 

ggcttgtctg ctcccggcat ccgcttacag acaagctgtg accgtctccg ggagctgcat 9840 

gtgtcagagg ttttcaccgt catcaccgaa acgcgcgagg cagctgcggt aaagctcatc 9900 

agcgtggtcg tgaagcgatt cacagatgtc tgcctgttca tccgcgtcca gctcgttgag 9960 

tttctccaga agcgttaatg tctggcttct gataaagcgg gccatgttaa gggcggtttt 10020 

ttcctgtttg gtcactgatg cctccgtgta agggggattt ctgttcatgg gggtaatgat 10080 

accgatgaaa cgagagagga tgctcacgat acgggttact gatgatgaac atgcccggtt 10140 

actggaacgt tgtgagggta aacaactggc ggtatggatg cggcgggacc agagaaaaat 10200 

cactcagggt caatgccagc gcttcgttaa tacagatgta ggtgttccac agggtagcca 10260 

gcagcatcct gcgatgcaga tccggaacat aatggtgcag ggcgctgact tccgcgtttc 10320 

cagactttac gaaacacgga aaccgaagac cattcatgtt gttgctcagg tcgcagacgt 10380 

tttgcagcag cagtcgcttc acgttcgctc gcgtatcggt gattcattct gctaaccagt 10440 

aaggcaaccc cgccagccta gccgggtcct caacgacagg agcacgatca tgcgcacccg 10500 

tggccaggac ccaacgctgc ccgagatgcg ccgcgtgcgg ctgctggaga tggcggacgc 10560 

gatggatatg ttctgccaag ggttggtttg cgcattcaca gttctccgca agaattgatt 10620 

ggctccaatt cttggagtgg tgaatccgtt agcgaggtgc cgccggcttc cattcaggtc 10680 

gaggtggccc ggctccatgc accgcgacgc aacgcgggga ggcagacaag gtatagggcg 10740 

gcgcctacaa tccatgccaa cccgttccat gtgctcgccg aggcggcata aatcgccgtg 10800 


acgatcagcg gtccaatgat cgaagttagg ctggtaagag ccgcgagcga tccttgaagc 10860 

tgtccctgat ggtcgtcatc tacctgcctg gacagcatgg cctgcaacgc gggcatcccg 10920 

atgccgccgg aagcgagaag aatcataatg gggaaggcca tccagcctcg cgtcgcgaac 10980 

gccagcaaga cgtagcccag cgcgtcggcc gccatgccgg cgataatggc ctgcttctcg 11040 

ccgaaacgtt tggtggcggg accagtgacg aaggcttgag cgagggcgtg caagattccg 11100 

aataccgcaa gcgacaggcc gatcatcgtc gcgctccagc gaaagcggtc ctcgccgaaa 11160 

atgacccaga gcgctgccgg cacctgtcct acgagttgca tgataaagaa gacagtcata 11220 

agtgcggcga cgatagtcat gccccgcgcc caccggaagg agctgactgg gttgaaggct 11280 

ctcaagggca tcggtcgagg atccttcaat atgcgcacat acgctgttat gttcaaggtc 11340 

ccttcgttta agaacgaaag cggtcttcct tttgagggat gtttcaagtt gttcaaatct 11400 

atcaaatttg caaatcccca gtctgtatct agagcgttga atcggtgatg cgatttgtta 11460 

attaaattga tggtgtcacc attaccaggt ctagatatac caatggcaaa ctgagcacaa 11520 

caataccagt ccggatcaac tggcaccatc tctcccgtag tctcatctaa tttttcttcc 11580 

ggatgaggtt ccagatatac cgcaacacct ttattatggt ttccctgagg gaataataga 11640 

atgtcccatt cgaaatcacc aattctaaac ctgggcgaat tgtatttcgg gtttgttaac 11700 

tcgttccagt caggaatgtt ccacgtgaag ctatcttcca gcaaagtctc cacttcttca 11760 

tcaaattgtg gagaatactc ccaatgctct tatctatggg acttccggga aacacagtac 11820 

cgatacttcc caattcgtct tcagagctca ttgtttgttt gaagagacta atcaaagaat 11880 

cgttttctca aaaaaattaa tatcttaact gatagtttga tcaaaggggc aaaacgtagg 11940 

ggcaaacaaa cggaaaaatc gtttctcaaa ttttctgatg ccaagaactc taaccagtct 12000 

tatctaaaaa ttgccttatg atccgtctct ccggttacag cctgtgtaac tgattaatcc 12060 

tgcctttcta atcaccattc taatgtttta attaagggat tttgtcttca ttaacggctt 12120 

tcgctcataa aaatgttatg acgttttgcc cgcaggcggg aaaccatcca cttcacgaga 12180 

ctgatctcct ctgccggaac accgggcatc tccaacttat aagttggaga aataagagaa 12240 

tttcagattg agagaatgaa aaaaaaaaac ccttagttca taggtccatt ctcttagcgc 12300 

aactacagag aacaggggca caaacaggca aaaaacgggc acaacctcaa tggagtgatg 12360 

caacctgcct ggagtaaatg atgacacaag gcaattgacc cacgcatgta tctatctcat 12420 

tttcttacac cttctattac cttctgctct ctctgatttg gaaaaagctg aaaaaaaagg 12480 

ttgaaaccag ttccctgaaa ttattcccct acttgactaa taagtatata aagacggtag 12540 

gtattgattg taattctgta aatctatttc ttaaacttct taaattctac ttttatagtt 12600 


agtctttttt 


ttagttttaa 


aacaccaaga 


acttagtttc 


gaataaacac 


acataaacaa 


12660 
acaagcttac aaaacaaa atg gct gca tat gca gct cag ggc tat aag gtg 12711
Met Ala Ala Tyr Ala Ala Gln Gly Tyr Lys Val
1 5 10

cta gta ctc aac ccc tct gtt gct gca aca ctg ggc ttt ggt gct tac 12759
Leu Val Leu Asn Pro Ser Val Ala Ala Thr Leu Gly Phe Gly Ala Tyr
15 20 25

atg tcc aag gct cat ggg atc gat cct aac atc agg acc ggg gtg aga 12807
Met Ser Lys Ala His Gly Ile Asp Pro Asn Ile Arg Thr Gly Val Arg
30 35 40

aca att acc act ggc agc ccc atc acg tac tcc acc tac ggc aag ttc 12855
Thr Ile Thr Thr Gly Ser Pro Ile Thr Tyr Ser Thr Tyr Gly Lys Phe
45 50 55

ctt gcc gac ggc ggg tgc tcg ggg ggc gct tat gac ata ata att tgt 12903
Leu Ala Asp Gly Gly Cys Ser Gly Gly Ala Tyr Asp Ile Ile Ile Cys
60 65 70 75

gac gag tgc cac tcc acg gat gcc aca tcc atc ttg ggc att ggc act 12951
Asp Glu Cys His Ser Thr Asp Ala Thr Ser Ile Leu Gly Ile Gly Thr
80 85 90

gtc ctt gac caa gca gag act gcg ggg gcg aga ctg gtt gtg ctc gcc 12999
Val Leu Asp Gln Ala Glu Thr Ala Gly Ala Arg Leu Val Val Leu Ala
95 100 105

acc gcc acc cct ccg ggc tcc gtc act gtg ccc cat ccc aac atc gag 13047
Thr Ala Thr Pro Pro Gly Ser Val Thr Val Pro His Pro Asn Ile Glu
110 115 120

gag gtt gct ctg tcc acc acc gga gag atc cct ttt tac ggc aag gct 13095
Glu Val Ala Leu Ser Thr Thr Gly Glu Ile Pro Phe Tyr Gly Lys Ala
125 130 135

atc ccc ctc gaa gta atc aag ggg ggg aga cat ctc atc ttc tgt cat 13143
Ile Pro Leu Glu Val Ile Lys Gly Gly Arg His Leu Ile Phe Cys His
140 145 150 155

tca aag aag aag tgc gac gaa ctc gcc gca aag ctg gtc gca ttg ggc 13191
Ser Lys Lys Lys Cys Asp Glu Leu Ala Ala Lys Leu Val Ala Leu Gly
160 165 170

atc aat gcc gtg gcc tac tac cgc ggt ctt gac gtg tcc gtc atc ccg 13239
Ile Asn Ala Val Ala Tyr Tyr Arg Gly Leu Asp Val Ser Val Ile Pro
175 180 185

acc agc ggc gat gtt gtc gtc gtg gca acc gat gcc ctc atg acc ggc 13287
Thr Ser Gly Asp Val Val Val Val Ala Thr Asp Ala Leu Met Thr Gly
190 195 200

tat acc ggc gac ttc gac tcg gtg ata gac tgc aat acg tgt gtc acc 13335
Tyr Thr Gly Asp Phe Asp Ser Val Ile Asp Cys Asn Thr Cys Val Thr
205 210 215

cag aca gtc gat ttc agc ctt gac cct acc ttc acc att gag aca atc 13383
Gln Thr Val Asp Phe Ser Leu Asp Pro Thr Phe Thr Ile Glu Thr Ile
220 225 230 235
acg ctc ccc caa gat gct gtc tcc cgc act caa cgt cgg ggc agg act 13431
Thr Leu Pro Gln Asp Ala Val Ser Arg Thr Gln Arg Arg Gly Arg Thr
240 245 250

ggc agg ggg aag cca ggc atc tac aga ttt gtg gca ccg ggg gag cgc 13479
Gly Arg Gly Lys Pro Gly Ile Tyr Arg Phe Val Ala Pro Gly Glu Arg
255 260 265

ccc tcc ggc atg ttc gac tcg tcc gtc ctc tgt gag tgc tat gac gca 13527
Pro Ser Gly Met Phe Asp Ser Ser Val Leu Cys Glu Cys Tyr Asp Ala
270 275 280

ggc tgt gct tgg tat gag ctc acg ccc gcc gag act aca gtt agg cta 13575
Gly Cys Ala Trp Tyr Glu Leu Thr Pro Ala Glu Thr Thr Val Arg Leu
285 290 295

cga gcg tac atg aac acc ccg ggg ctt ccc gtg tgc cag gac cat ctt 13623
Arg Ala Tyr Met Asn Thr Pro Gly Leu Pro Val Cys Gln Asp His Leu
300 305 310 315

gaa ttt tgg gag ggc gtc ttt aca ggc ctc act cat ata gat gcc cac 13671
Glu Phe Trp Glu Gly Val Phe Thr Gly Leu Thr His Ile Asp Ala His
320 325 330

ttt cta tcc cag aca aag cag agt ggg gag aac ctt cct tac ctg gta 13719
Phe Leu Ser Gln Thr Lys Gln Ser Gly Glu Asn Leu Pro Tyr Leu Val
335 340 345

gcg tac caa gcc acc gtg tgc gct agg gct caa gcc cct ccc cca tcg 13767
Ala Tyr Gln Ala Thr Val Cys Ala Arg Ala Gln Ala Pro Pro Pro Ser
350 355 360

tgg gac cag atg tgg aag tgt ttg att cgc ctc aag ccc acc ctc cat 13815
Trp Asp Gln Met Trp Lys Cys Leu Ile Arg Leu Lys Pro Thr Leu His
365 370 375

ggg cca aca ccc ctg cta tac aga ctg ggc gct gtt cag aat gaa atc 13863
Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Ala Val Gln Asn Glu Ile
380 385 390 395

acc ctg acg cac cca gtc acc aaa tac atc atg aca tgc atg tcg gcc 13911
Thr Leu Thr His Pro Val Thr Lys Tyr Ile Met Thr Cys Met Ser Ala
400 405 410

gac ctg gag gtc gtc acg agc acc tgg gtg ctc gtt ggc ggc gtc ctg 13959
Asp Leu Glu Val Val Thr Ser Thr Trp Val Leu Val Gly Gly Val Leu
415 420 425

gct gct ttg gcc gcg tat tgc ctg tca aca ggc tgc gtg gtc ata gtg 14007
Ala Ala Leu Ala Ala Tyr Cys Leu Ser Thr Gly Cys Val Val Ile Val
430 435 440

ggc agg gtc gtc ttg tcc ggg aag ccg gca atc ata cct gac agg gaa 14055
Gly Arg Val Val Leu Ser Gly Lys Pro Ala Ile Ile Pro Asp Arg Glu
445 450 455

gtc ctc tac cga gag ttc gat gag atg gaa gag tgc tct cag cac tta 14103
Val Leu Tyr Arg Glu Phe Asp Glu Met Glu Glu Cys Ser Gln His Leu
460 465 470 475
ccg tac atc gag caa ggg atg atg ctc gcc gag cag ttc aag cag aag 14151
Pro Tyr Ile Glu Gln Gly Met Met Leu Ala Glu Gln Phe Lys Gln Lys
480 485 490

gcc ctc ggc ctc ctg cag acc gcg tcc cgt cag gca gag gtt atc gcc 14199
Ala Leu Gly Leu Leu Gln Thr Ala Ser Arg Gln Ala Glu Val Ile Ala
495 500 505

cct gct gtc cag acc aac tgg caa aaa ctc gag acc ttc tgg gcg aag 14247
Pro Ala Val Gln Thr Asn Trp Gln Lys Leu Glu Thr Phe Trp Ala Lys
510 515 520

cat atg tgg aac ttc atc agt ggg ata caa tac ttg gcg ggc ttg tca 14295
His Met Trp Asn Phe Ile Ser Gly Ile Gln Tyr Leu Ala Gly Leu Ser
525 530 535

acg ctg cct ggt aac ccc gcc att gct tca ttg atg gct ttt aca gct 14343
Thr Leu Pro Gly Asn Pro Ala Ile Ala Ser Leu Met Ala Phe Thr Ala
540 545 550 555

gct gtc acc agc cca cta acc act agc caa acc ctc ctc ttc aac ata 14391
Ala Val Thr Ser Pro Leu Thr Thr Ser Gln Thr Leu Leu Phe Asn Ile
560 565 570

ttg ggg ggg tgg gtg gct gcc cag ctc gcc gcc ccc ggt gcc gct act 14439
Leu Gly Gly Trp Val Ala Ala Gln Leu Ala Ala Pro Gly Ala Ala Thr
575 580 585

gcc ttt gtg ggc gct ggc tta gct ggc gcc gcc atc ggc agt gtt gga 14487
Ala Phe Val Gly Ala Gly Leu Ala Gly Ala Ala Ile Gly Ser Val Gly
590 595 600

ctg ggg aag gtc ctc ata gac atc ctt gca ggg tat ggc gcg ggc gtg 14535
Leu Gly Lys Val Leu Ile Asp Ile Leu Ala Gly Tyr Gly Ala Gly Val
605 610 615

gcg gga gct ctt gtg gca ttc aag atc atg agc ggt gag gtc ccc tcc 14583
Ala Gly Ala Leu Val Ala Phe Lys Ile Met Ser Gly Glu Val Pro Ser
620 625 630 635

acg gag gac ctg gtc aat cta ctg ccc gcc atc ctc tcg ccc gga gcc 14631
Thr Glu Asp Leu Val Asn Leu Leu Pro Ala Ile Leu Ser Pro Gly Ala
640 645 650

ctc gta gtc ggc gtg gtc tgt gca gca ata ctg cgc cgg cac gtt ggc 14679
Leu Val Val Gly Val Val Cys Ala Ala Ile Leu Arg Arg His Val Gly
655 660 665

ccg ggc gag ggg gca gtg cag tgg atg aac cgg ctg ata gcc ttc gcc 14727
Pro Gly Glu Gly Ala Val Gln Trp Met Asn Arg Leu Ile Ala Phe Ala
670 675 680

tcc cgg ggg aac cat gtt tcc ccc acg cac tac gtg ccg gag agc gat 14775
Ser Arg Gly Asn His Val Ser Pro Thr His Tyr Val Pro Glu Ser Asp
685 690 695

gca gct gcc cgc gtc act gcc ata ctc agc agc ctc act gta acc cag 14823
Ala Ala Ala Arg Val Thr Ala Ile Leu Ser Ser Leu Thr Val Thr Gln
700 705 710 715
ctc ctg agg cga ctg cac cag tgg ata agc tcg gag tgt acc act cca 14871
Leu Leu Arg Arg Leu His Gln Trp Ile Ser Ser Glu Cys Thr Thr Pro
720 725 730

tgc tcc ggt tcc tgg cta agg gac atc tgg gac tgg ata tgc gag gtg 14919
Cys Ser Gly Ser Trp Leu Arg Asp Ile Trp Asp Trp Ile Cys Glu Val
735 740 745

ttg agc gac ttt aag acc tgg cta aaa gct aag ctc atg cca cag ctg 14967
Leu Ser Asp Phe Lys Thr Trp Leu Lys Ala Lys Leu Met Pro Gln Leu
750 755 760

cct ggg atc ccc ttt gtg tcc tgc cag cgc ggg tat aag ggg gtc tgg 15015
Pro Gly Ile Pro Phe Val Ser Cys Gln Arg Gly Tyr Lys Gly Val Trp
765 770 775

cga ggg gac ggc atc atg cac act cgc tgc cac tgt gga gct gag atc 15063
Arg Gly Asp Gly Ile Met His Thr Arg Cys His Cys Gly Ala Glu Ile
780 785 790 795

act gga cat gtc aaa aac ggg acg atg agg atc gtc ggt cct agg acc 15111
Thr Gly His Val Lys Asn Gly Thr Met Arg Ile Val Gly Pro Arg Thr
800 805 810

tgc agg aac atg tgg agt ggg acc ttc ccc att aat gcc tac acc acg 15159
Cys Arg Asn Met Trp Ser Gly Thr Phe Pro Ile Asn Ala Tyr Thr Thr
815 820 825

ggc ccc tgt acc ccc ctt cct gcg ccg aac tac acg ttc gcg cta tgg 15207
Gly Pro Cys Thr Pro Leu Pro Ala Pro Asn Tyr Thr Phe Ala Leu Trp
830 835 840

agg gtg tct gca gag gaa tac gtg gag ata agg cag gtg ggg gac ttc 15255
Arg Val Ser Ala Glu Glu Tyr Val Glu Ile Arg Gln Val Gly Asp Phe
845 850 855

cac tac gtg acg ggt atg act act gac aat ctt aaa tgc ccg tgc cag 15303
His Tyr Val Thr Gly Met Thr Thr Asp Asn Leu Lys Cys Pro Cys Gln
860 865 870 875

gtc cca tcg ccc gaa ttt ttc aca gaa ttg gac ggg gtg cgc cta cat 15351
Val Pro Ser Pro Glu Phe Phe Thr Glu Leu Asp Gly Val Arg Leu His
880 885 890

agg ttt gcg ccc ccc tgc aag ccc ttg ctg cgg gag gag gta tca ttc 15399
Arg Phe Ala Pro Pro Cys Lys Pro Leu Leu Arg Glu Glu Val Ser Phe
895 900 905

aga gta gga ctc cac gaa tac ccg gta ggg tcg caa tta cct tgc gag 15447
Arg Val Gly Leu His Glu Tyr Pro Val Gly Ser Gln Leu Pro Cys Glu
910 915 920

ccc gaa ccg gac gtg gcc gtg ttg acg tcc atg ctc act gat ccc tcc 15495
Pro Glu Pro Asp Val Ala Val Leu Thr Ser Met Leu Thr Asp Pro Ser
925 930 935

cat ata aca gca gag gcg gcc ggg cga agg ttg gcg agg gga tca ccc 15543
His Ile Thr Ala Glu Ala Ala Gly Arg Arg Leu Ala Arg Gly Ser Pro
940 945 950 955
ccc tct gtg gcc agc tcc tcg gct agc cag cta tcc gct cca tct ctc 15591
Pro Ser Val Ala Ser Ser Ser Ala Ser Gln Leu Ser Ala Pro Ser Leu
960 965 970

aag gca act tgc acc gct aac cat gac tcc cct gat gct gag ctc ata 15639
Lys Ala Thr Cys Thr Ala Asn His Asp Ser Pro Asp Ala Glu Leu Ile
975 980 985

gag gcc aac ctc cta tgg agg cag gag atg ggc ggc aac atc acc agg 15687
Glu Ala Asn Leu Leu Trp Arg Gln Glu Met Gly Gly Asn Ile Thr Arg
990 995 1000

gtt gag tca gaa aac aaa gtg gtg att ctg gac tcc ttc gat ccg ctt 15735
Val Glu Ser Glu Asn Lys Val Val Ile Leu Asp Ser Phe Asp Pro Leu
1005 1010 1015

gtg gcg gag gag gac gag cgg gag atc tcc gta ccc gca gaa atc ctg 15783
Val Ala Glu Glu Asp Glu Arg Glu Ile Ser Val Pro Ala Glu Ile Leu
1020 1025 1030 1035

cgg aag tct cgg aga ttc gcc cag gcc ctg ccc gtt tgg gcg cgg ccg 15831
Arg Lys Ser Arg Arg Phe Ala Gln Ala Leu Pro Val Trp Ala Arg Pro
1040 1045 1050

gac tat aac ccc ccg cta gtg gag acg tgg aaa aag ccc gac tac gaa 15879
Asp Tyr Asn Pro Pro Leu Val Glu Thr Trp Lys Lys Pro Asp Tyr Glu
1055 1060 1065

cca cct gtg gtc cat ggc tgc ccg ctt cca cct cca aag tcc cct cct 15927
Pro Pro Val Val His Gly Cys Pro Leu Pro Pro Pro Lys Ser Pro Pro
1070 1075 1080

gtg cct ccg cct cgg aag aag cgg acg gtg gtc ctc act gaa tca acc 15975
Val Pro Pro Pro Arg Lys Lys Arg Thr Val Val Leu Thr Glu Ser Thr
1085 1090 1095

cta tct act gcc ttg gcc gag ctc gcc acc aga agc ttt ggc agc tcc 16023
Leu Ser Thr Ala Leu Ala Glu Leu Ala Thr Arg Ser Phe Gly Ser Ser
1100 1105 1110 1115

tca act tcc ggc att acg ggc gac aat acg aca aca tcc tct gag ccc 16071
Ser Thr Ser Gly Ile Thr Gly Asp Asn Thr Thr Thr Ser Ser Glu Pro
1120 1125 1130

gcc cct tct ggc tgc ccc ccc gac tcc gac gct gag tcc tat tcc tcc 16119
Ala Pro Ser Gly Cys Pro Pro Asp Ser Asp Ala Glu Ser Tyr Ser Ser
1135 1140 1145

atg ccc ccc ctg gag ggg gag cct ggg gat ccg gat ctt agc gac ggg 16167
Met Pro Pro Leu Glu Gly Glu Pro Gly Asp Pro Asp Leu Ser Asp Gly
1150 1155 1160

tca tgg tca acg gtc agt agt gag gcc aac gcg gag gat gtc gtg tgc 16215
Ser Trp Ser Thr Val Ser Ser Glu Ala Asn Ala Glu Asp Val Val Cys
1165 1170 1175

tgc tca atg tct tac tct tgg aca ggc gca ctc gtc acc ccg tgc gcc 16263
Cys Ser Met Ser Tyr Ser Trp Thr Gly Ala Leu Val Thr Pro Cys Ala
1180 1185 1190 1195
gcg gaa gaa cag aaa ctg ccc atc aat gca cta agc aac tcg ttg cta 16311
Ala Glu Glu Gln Lys Leu Pro Ile Asn Ala Leu Ser Asn Ser Leu Leu
1200 1205 1210

cgt cac cac aat ttg gtg tat tcc acc acc tca cgc agt gct tgc caa 16359
Arg His His Asn Leu Val Tyr Ser Thr Thr Ser Arg Ser Ala Cys Gln
1215 1220 1225

agg cag aag aaa gtc aca ttt gac aga ctg caa gtt ctg gac agc cat 16407
Arg Gln Lys Lys Val Thr Phe Asp Arg Leu Gln Val Leu Asp Ser His
1230 1235 1240

tac cag gac gta ctc aag gag gtt aaa gca gcg gcg tca aaa gtg aag 16455
Tyr Gln Asp Val Leu Lys Glu Val Lys Ala Ala Ala Ser Lys Val Lys
1245 1250 1255

gct aac ttg cta tcc gta gag gaa gct tgc agc ctg acg ccc cca cac 16503
Ala Asn Leu Leu Ser Val Glu Glu Ala Cys Ser Leu Thr Pro Pro His
1260 1265 1270 1275

tca gcc aaa tcc aag ttt ggt tat ggg gca aaa gac gtc cgt tgc cat 16551
Ser Ala Lys Ser Lys Phe Gly Tyr Gly Ala Lys Asp Val Arg Cys His
1280 1285 1290

gcc aga aag gcc gta acc cac atc aac tcc gtg tgg aaa gac ctt ctg 16599
Ala Arg Lys Ala Val Thr His Ile Asn Ser Val Trp Lys Asp Leu Leu
1295 1300 1305

gaa gac aat gta aca cca ata gac act acc atc atg gct aag aac gag 16647
Glu Asp Asn Val Thr Pro Ile Asp Thr Thr Ile Met Ala Lys Asn Glu
1310 1315 1320

gtt ttc tgc gtt cag cct gag aag ggg ggt cgt aag cca gct cgt ctc 16695
Val Phe Cys Val Gln Pro Glu Lys Gly Gly Arg Lys Pro Ala Arg Leu
1325 1330 1335

atc gtg ttc ccc gat ctg ggc gtg cgc gtg tgc gaa aag atg gct ttg 16743
Ile Val Phe Pro Asp Leu Gly Val Arg Val Cys Glu Lys Met Ala Leu
1340 1345 1350 1355

tac gac gtg gtt aca aag ctc ccc ttg gcc gtg atg gga agc tcc tac 16791
Tyr Asp Val Val Thr Lys Leu Pro Leu Ala Val Met Gly Ser Ser Tyr
1360 1365 1370

gga ttc caa tac tca cca gga cag cgg gtt gaa ttc ctc gtg caa gcg 16839
Gly Phe Gln Tyr Ser Pro Gly Gln Arg Val Glu Phe Leu Val Gln Ala
1375 1380 1385

tgg aag tcc aag aaa acc cca atg ggg ttc tcg tat gat acc cgc tgc 16887
Trp Lys Ser Lys Lys Thr Pro Met Gly Phe Ser Tyr Asp Thr Arg Cys
1390 1395 1400

ttt gac tcc aca gtc act gag agc gac atc cgt acg gag gag gca atc 16935
Phe Asp Ser Thr Val Thr Glu Ser Asp Ile Arg Thr Glu Glu Ala Ile
1405 1410 1415

tac caa tgt tgt gac ctc gac ccc caa gcc cgc gtg gcc atc aag tcc 16983
Tyr Gln Cys Cys Asp Leu Asp Pro Gln Ala Arg Val Ala Ile Lys Ser
1420 1425 1430 1435
ctc acc gag agg ctt tat gtt ggg ggc cct ctt acc aat tca agg ggg 17031
Leu Thr Glu Arg Leu Tyr Val Gly Gly Pro Leu Thr Asn Ser Arg Gly
1440 1445 1450

gag aac tgc ggc tat cgc agg tgc cgc gcg agc ggc gta ctg aca act 17079
Glu Asn Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Leu Thr Thr
1455 1460 1465

agc tgt ggt aac acc ctc act tgc tac atc aag gcc cgg gca gcc tgt 17127
Ser Cys Gly Asn Thr Leu Thr Cys Tyr Ile Lys Ala Arg Ala Ala Cys
1470 1475 1480

cga gcc gca ggg ctc cag gac tgc acc atg ctc gtg tgt ggc gac gac 17175
Arg Ala Ala Gly Leu Gln Asp Cys Thr Met Leu Val Cys Gly Asp Asp
1485 1490 1495

tta gtc gtt atc tgt gaa agc gcg ggg gtc cag gag gac gcg gcg agc 17223
Leu Val Val Ile Cys Glu Ser Ala Gly Val Gln Glu Asp Ala Ala Ser
1500 1505 1510 1515

ctg aga gcc ttc acg gag gct atg acc agg tac tcc gcc ccc cct ggg 17271
Leu Arg Ala Phe Thr Glu Ala Met Thr Arg Tyr Ser Ala Pro Pro Gly
1520 1525 1530

gac ccc cca caa cca gaa tac gac ttg gag ctc ata aca tca tgc tcc 17319
Asp Pro Pro Gln Pro Glu Tyr Asp Leu Glu Leu Ile Thr Ser Cys Ser
1535 1540 1545

tcc aac gtg tca gtc gcc cac gac ggc gct gga aag agg gtc tac tac 17367
Ser Asn Val Ser Val Ala His Asp Gly Ala Gly Lys Arg Val Tyr Tyr
1550 1555 1560

ctc acc cgt gac cct aca acc ccc ctc gcg aga gct gcg tgg gag aca 17415
Leu Thr Arg Asp Pro Thr Thr Pro Leu Ala Arg Ala Ala Trp Glu Thr
1565 1570 1575

gca aga cac act cca gtc aat tcc tgg cta ggc aac ata atc atg ttt 17463
Ala Arg His Thr Pro Val Asn Ser Trp Leu Gly Asn Ile Ile Met Phe
1580 1585 1590 1595

gcc ccc aca ctg tgg gcg agg atg ata ctg atg acc cat ttc ttt agc 17511
Ala Pro Thr Leu Trp Ala Arg Met Ile Leu Met Thr His Phe Phe Ser
1600 1605 1610

gtc ctt ata gcc agg gac cag ctt gaa cag gcc ctc gat tgc gag atc 17559
Val Leu Ile Ala Arg Asp Gln Leu Glu Gln Ala Leu Asp Cys Glu Ile
1615 1620 1625

tac ggg gcc tgc tac tcc ata gaa cca ctg gat cta cct cca atc att 17607
Tyr Gly Ala Cys Tyr Ser Ile Glu Pro Leu Asp Leu Pro Pro Ile Ile
1630 1635 1640

caa aga ctc cat ggc ctc agc gca ttt tca ctc cac agt tac tct cca 17655
Gln Arg Leu His Gly Leu Ser Ala Phe Ser Leu His Ser Tyr Ser Pro
1645 1650 1655

ggt gaa atc aat agg gtg gcc gca tgc ctc aga aaa ctt ggg gta ccg 17703
Gly Glu Ile Asn Arg Val Ala Ala Cys Leu Arg Lys Leu Gly Val Pro
1660 1665 1670 1675
ccc ttg cga gct tgg aga cac cgg gcc cgg agc gtc cgc gct agg ctt 17751
Pro Leu Arg Ala Trp Arg His Arg Ala Arg Ser Val Arg Ala Arg Leu
1680 1685 1690

ctg gcc aga gga ggc agg gct gcc ata tgt ggc aag tac ctc ttc aac 17799
Leu Ala Arg Gly Gly Arg Ala Ala Ile Cys Gly Lys Tyr Leu Phe Asn
1695 1700 1705

tgg gca gta aga aca aag ctc aaa ctc act cca ata gcg gcc gct ggc 17847
Trp Ala Val Arg Thr Lys Leu Lys Leu Thr Pro Ile Ala Ala Ala Gly
1710 1715 1720

cag ctg gac ttg tcc ggc tgg ttc acg gct ggc tac agc ggg gga gac 17895
Gln Leu Asp Leu Ser Gly Trp Phe Thr Ala Gly Tyr Ser Gly Gly Asp
1725 1730 1735

att tat cac agc gtg tct cat gcc cgg ccc cgc tgg atc tgg ttt tgc 17943
Ile Tyr His Ser Val Ser His Ala Arg Pro Arg Trp Ile Trp Phe Cys
1740 1745 1750 1755

cta ctc ctg ctt gct gca ggg gta ggc atc tac ctc ctc ccc aac cga 17991
Leu Leu Leu Leu Ala Ala Gly Val Gly Ile Tyr Leu Leu Pro Asn Arg
1760 1765 1770

atg agc acg aat cct aaa cct caa aga aag acc aaa cgt aac acc aac 18039
Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr Asn
1775 1780 1785

cgg cgg ccg cag gac gtc aag ttc ccg ggt ggc ggt cag atc gtt ggt 18087
Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly Gln Ile Val Gly
1790 1795 1800

gga gtt tac ttg ttg ccg cgc agg ggc cct aga ttg ggt gtg cgc gcg 18135
Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Leu Gly Val Arg Ala
1805 1810 1815

acg aga aag act tcc gag cgg tcg caa cct cga ggt aga cgt cag cct 18183
Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly Arg Arg Gln Pro
1820 1825 1830 1835

atc ccc aag gct cgt cgg ccc gag ggc agg acc tgg gct cag ccc ggg 18231
Ile Pro Lys Ala Arg Arg Pro Glu Gly Arg Thr Trp Ala Gln Pro Gly
1840 1845 1850

tac cct tgg ccc ctc tat ggc aat gag ggc tgc ggg tgg gcg gga tgg 18279
Tyr Pro Trp Pro Leu Tyr Gly Asn Glu Gly Cys Gly Trp Ala Gly Trp
1855 1860 1865

ctc ctg tct ccc cgt ggc tct cgg cct agc tgg ggc ccc aca gac ccc 18327
Leu Leu Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Thr Asp Pro
1870 1875 1880

cgg cgt agg tcg cgc aat ttg ggt aag gtc atc gat acc ctt acg tgc 18375
Arg Arg Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu Thr Cys
1885 1890 1895

ggc ttc gcc gac ctc atg ggg tac ata ccg ctc gtc ggc gcc cct ctt 18423
Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Ala Pro Leu
1900 1905 1910 1915
gga ggc gct gcc agg gcc ctg gcg cat ggc gtc cgg gtt ctg gaa gac 18471
Gly Gly Ala Ala Arg Ala Leu Ala His Gly Val Arg Val Leu Glu Asp
1920 1925 1930

ggc gtg aac tat gca aca ggg aac ctt cct ggt tgc tct taatagtcga 18520
Gly Val Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser
1935 1940



ctttgttccc actgtacttt tagctcgtac aaaatacaat atacttttca tttctccgta 18580

aacaacatgt tttcccatgt aatatccttt tctatttttc gttccgttac caactttaca 18640

catactttat atagctattc acttctatac actaaaaaac taagacaatt ttaattttgc 18700

tgcctgccat atttcaattt gttataaatt cctataattt atcctattag tagctaaaaa 18760

aagatgaatg tgaatcgaat cctaagagaa ttggatctga tccacaggac gggtgtggtc 18820

gccatgatcg cgtagtcgat agtggctcca agtagcgaag cgagcaggac tgggcggcgg 18880

ccaaagcggt cggacagtgc tccgagaacg ggtgcgcata gaaattgcat caacgcatat 18940

agcgctagca gcacgccata gtgactggcg atgctgtcgg aatggacgat atcccgcaag 19000

aggcccggca gtaccggcat aaccaagcct atgcctacag catccagggt gacggtgccg 19060

aggatgacga tgagcgcatt gttagatttc atacacggtg cctgactgcg ttagcaattt 19120

aactgtgata aactaccgca ttaaagcttt ttctttccaa tttttttttt ttcgtcatta 19180

taaaaatcat tacgaccgag attcccgggt aataactgat ataattaaat tgaagctcta 19240

atttgtgagt ttagtataca tgcatttact tataatacag ttttttagtt ttgctggccg 19300

catcttctca aatatgcttc ccagcctgct tttctgtaac gttcaccctc taccttagca 19360

tcccttccct ttgcaaatag tcctcttcca acaataataa tgtcagatcc tgtagagacc 19420

acatcatcca cggttctata ctgttgaccc aatgcgtctc ccttgtcatc taaacccaca 19480

ccgggtgtca taatcaacca atcgtaacct tcatctcttc cacccatgtc tctttgagca 19540

ataaagccga taacaaaatc tttgtcgctc ttcgcaatgt caacagtacc cttagtatat 19600

tctccagtag atagggagcc cttgcatgac aattctgcta acatcaaaag gcctctaggt 19660

tcctttgtta cttcttctgc cgcctgcttc aaaccgctaa caatacctgg gcccaccaca 19720

ccgtgtgcat tcgtaatgtc tgcccattct gctattctgt atacacccgc agagtactgc 19780

aatttgactg tattaccaat gtcagcaaat tttctgtctt cgaagagtaa aaaattgtac 19840

ttggcggata atgcctttag cggcttaact gtgccctcca tggaaaaatc agtcaagata 19900

tccacatgtg tttttagtaa acaaattttg ggacctaatg cttcaactaa ctccagtaat 19960

tccttggtgg tacgaacatc caatgaagca cacaagtttg tttgcttttc gtgcatgata 20020

ttaaatagct tggcagcaac aggactagga tgagtagcag cacgttcctt atatgtagct 20080

ttcgacatga tttatcttcg tttcctgcag gtttttgttc tgtgcagttg ggttaagaat 20140

actgggcaat ttcatgtttc ttcaacacta catatgcgta tatataccaa tctaagtctg 20200 

tgctccttcc ttcgttcttc cttctgttcg gagattaccg aatcaaaaaa atttcaagga 20260 

aaccgaaatc aaaaaaaaga ataaaaaaaa aatgatgaat tgaaaagctt atcgat 20316 



<210> 15 
<211> 1944 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 
pd.delta.NS3NS5.pj.core173

<400> 15 

Met Ala Ala Tyr Ala Ala Gln Gly Tyr Lys Val Leu Val Leu Asn Pro
1 5 10 15

Ser Val Ala Ala Thr Leu Gly Phe Gly Ala Tyr Met Ser Lys Ala His
20 25 30

Gly Ile Asp Pro Asn Ile Arg Thr Gly Val Arg Thr Ile Thr Thr Gly
35 40 45

Ser Pro Ile Thr Tyr Ser Thr Tyr Gly Lys Phe Leu Ala Asp Gly Gly
50 55 60

Cys Ser Gly Gly Ala Tyr Asp Ile Ile Ile Cys Asp Glu Cys His Ser
65 70 75 80

Thr Asp Ala Thr Ser Ile Leu Gly Ile Gly Thr Val Leu Asp Gln Ala
85 90 95

Glu Thr Ala Gly Ala Arg Leu Val Val Leu Ala Thr Ala Thr Pro Pro
100 105 110

Gly Ser Val Thr Val Pro His Pro Asn Ile Glu Glu Val Ala Leu Ser
115 120 125

Thr Thr Gly Glu Ile Pro Phe Tyr Gly Lys Ala Ile Pro Leu Glu Val
130 135 140

Ile Lys Gly Gly Arg His Leu Ile Phe Cys His Ser Lys Lys Lys Cys
145 150 155 160

Asp Glu Leu Ala Ala Lys Leu Val Ala Leu Gly Ile Asn Ala Val Ala
165 170 175

Tyr Tyr Arg Gly Leu Asp Val Ser Val Ile Pro Thr Ser Gly Asp Val
180 185 190

Val Val Val Ala Thr Asp Ala Leu Met Thr Gly Tyr Thr Gly Asp Phe
195 200 205

Asp Ser Val Ile Asp Cys Asn Thr Cys Val Thr Gln Thr Val Asp Phe
210 215 220

Ser Leu Asp Pro Thr Phe Thr Ile Glu Thr Ile Thr Leu Pro Gln Asp
225 230 235 240

Ala Val Ser Arg Thr Gln Arg Arg Gly Arg Thr Gly Arg Gly Lys Pro
245 250 255

Gly Ile Tyr Arg Phe Val Ala Pro Gly Glu Arg Pro Ser Gly Met Phe
260 265 270

Asp Ser Ser Val Leu Cys Glu Cys Tyr Asp Ala Gly Cys Ala Trp Tyr
275 280 285

Glu Leu Thr Pro Ala Glu Thr Thr Val Arg Leu Arg Ala Tyr Met Asn
290 295 300

Thr Pro Gly Leu Pro Val Cys Gln Asp His Leu Glu Phe Trp Glu Gly
305 310 315 320

Val Phe Thr Gly Leu Thr His Ile Asp Ala His Phe Leu Ser Gln Thr
325 330 335

Lys Gln Ser Gly Glu Asn Leu Pro Tyr Leu Val Ala Tyr Gln Ala Thr
340 345 350

Val Cys Ala Arg Ala Gln Ala Pro Pro Pro Ser Trp Asp Gln Met Trp
355 360 365

Lys Cys Leu Ile Arg Leu Lys Pro Thr Leu His Gly Pro Thr Pro Leu
370 375 380

Leu Tyr Arg Leu Gly Ala Val Gln Asn Glu Ile Thr Leu Thr His Pro
385 390 395 400

Val Thr Lys Tyr Ile Met Thr Cys Met Ser Ala Asp Leu Glu Val Val
405 410 415

Thr Ser Thr Trp Val Leu Val Gly Gly Val Leu Ala Ala Leu Ala Ala
420 425 430

Tyr Cys Leu Ser Thr Gly Cys Val Val Ile Val Gly Arg Val Val Leu
435 440 445

Ser Gly Lys Pro Ala Ile Ile Pro Asp Arg Glu Val Leu Tyr Arg Glu
450 455 460

Phe Asp Glu Met Glu Glu Cys Ser Gln His Leu Pro Tyr Ile Glu Gln
465 470 475 480

Gly Met Met Leu Ala Glu Gln Phe Lys Gln Lys Ala Leu Gly Leu Leu
485 490 495

Gln Thr Ala Ser Arg Gln Ala Glu Val Ile Ala Pro Ala Val Gln Thr
500 505 510

Asn Trp Gln Lys Leu Glu Thr Phe Trp Ala Lys His Met Trp Asn Phe
515 520 525
Ile Ser Gly Ile Gln Tyr Leu Ala Gly Leu Ser Thr Leu Pro Gly Asn
530 535 540

Pro Ala Ile Ala Ser Leu Met Ala Phe Thr Ala Ala Val Thr Ser Pro
545 550 555 560

Leu Thr Thr Ser Gln Thr Leu Leu Phe Asn Ile Leu Gly Gly Trp Val
565 570 575

Ala Ala Gln Leu Ala Ala Pro Gly Ala Ala Thr Ala Phe Val Gly Ala
580 585 590

Gly Leu Ala Gly Ala Ala Ile Gly Ser Val Gly Leu Gly Lys Val Leu
595 600 605

Ile Asp Ile Leu Ala Gly Tyr Gly Ala Gly Val Ala Gly Ala Leu Val
610 615 620

Ala Phe Lys Ile Met Ser Gly Glu Val Pro Ser Thr Glu Asp Leu Val
625 630 635 640

Asn Leu Leu Pro Ala Ile Leu Ser Pro Gly Ala Leu Val Val Gly Val
645 650 655

Val Cys Ala Ala Ile Leu Arg Arg His Val Gly Pro Gly Glu Gly Ala
660 665 670

Val Gln Trp Met Asn Arg Leu Ile Ala Phe Ala Ser Arg Gly Asn His
675 680 685

Val Ser Pro Thr His Tyr Val Pro Glu Ser Asp Ala Ala Ala Arg Val
690 695 700

Thr Ala Ile Leu Ser Ser Leu Thr Val Thr Gln Leu Leu Arg Arg Leu
705 710 715 720

His Gln Trp Ile Ser Ser Glu Cys Thr Thr Pro Cys Ser Gly Ser Trp
725 730 735

Leu Arg Asp Ile Trp Asp Trp Ile Cys Glu Val Leu Ser Asp Phe Lys
740 745 750

Thr Trp Leu Lys Ala Lys Leu Met Pro Gln Leu Pro Gly Ile Pro Phe
755 760 765

Val Ser Cys Gln Arg Gly Tyr Lys Gly Val Trp Arg Gly Asp Gly Ile
770 775 780

Met His Thr Arg Cys His Cys Gly Ala Glu Ile Thr Gly His Val Lys
785 790 795 800

Asn Gly Thr Met Arg Ile Val Gly Pro Arg Thr Cys Arg Asn Met Trp
805 810 815

Ser Gly Thr Phe Pro Ile Asn Ala Tyr Thr Thr Gly Pro Cys Thr Pro
820 825 830

Leu Pro Ala Pro Asn Tyr Thr Phe Ala Leu Trp Arg Val Ser Ala Glu
835 840 845
Glu Tyr Val Glu Ile Arg Gln Val Gly Asp Phe His Tyr Val Thr Gly
850 855 860

Met Thr Thr Asp Asn Leu Lys Cys Pro Cys Gln Val Pro Ser Pro Glu
865 870 875 880

Phe Phe Thr Glu Leu Asp Gly Val Arg Leu His Arg Phe Ala Pro Pro
885 890 895

Cys Lys Pro Leu Leu Arg Glu Glu Val Ser Phe Arg Val Gly Leu His
900 905 910

Glu Tyr Pro Val Gly Ser Gln Leu Pro Cys Glu Pro Glu Pro Asp Val
915 920 925

Ala Val Leu Thr Ser Met Leu Thr Asp Pro Ser His Ile Thr Ala Glu
930 935 940

Ala Ala Gly Arg Arg Leu Ala Arg Gly Ser Pro Pro Ser Val Ala Ser
945 950 955 960

Ser Ser Ala Ser Gln Leu Ser Ala Pro Ser Leu Lys Ala Thr Cys Thr
965 970 975

Ala Asn His Asp Ser Pro Asp Ala Glu Leu Ile Glu Ala Asn Leu Leu
980 985 990

Trp Arg Gln Glu Met Gly Gly Asn Ile Thr Arg Val Glu Ser Glu Asn
995 1000 1005

Lys Val Val Ile Leu Asp Ser Phe Asp Pro Leu Val Ala Glu Glu Asp
1010 1015 1020

Glu Arg Glu Ile Ser Val Pro Ala Glu Ile Leu Arg Lys Ser Arg Arg
1025 1030 1035 1040

Phe Ala Gln Ala Leu Pro Val Trp Ala Arg Pro Asp Tyr Asn Pro Pro
1045 1050 1055

Leu Val Glu Thr Trp Lys Lys Pro Asp Tyr Glu Pro Pro Val Val His
1060 1065 1070

Gly Cys Pro Leu Pro Pro Pro Lys Ser Pro Pro Val Pro Pro Pro Arg
1075 1080 1085

Lys Lys Arg Thr Val Val Leu Thr Glu Ser Thr Leu Ser Thr Ala Leu
1090 1095 1100

Ala Glu Leu Ala Thr Arg Ser Phe Gly Ser Ser Ser Thr Ser Gly Ile
1105 1110 1115 1120

Thr Gly Asp Asn Thr Thr Thr Ser Ser Glu Pro Ala Pro Ser Gly Cys
1125 1130 1135

Pro Pro Asp Ser Asp Ala Glu Ser Tyr Ser Ser Met Pro Pro Leu Glu
1140 1145 1150

Gly Glu Pro Gly Asp Pro Asp Leu Ser Asp Gly Ser Trp Ser Thr Val
1155 1160 1165
Ser Ser Glu Ala Asn Ala Glu Asp Val Val Cys Cys Ser Met Ser Tyr
1170 1175 1180

Ser Trp Thr Gly Ala Leu Val Thr Pro Cys Ala Ala Glu Glu Gln Lys
1185 1190 1195 1200

Leu Pro Ile Asn Ala Leu Ser Asn Ser Leu Leu Arg His His Asn Leu
1205 1210 1215

Val Tyr Ser Thr Thr Ser Arg Ser Ala Cys Gln Arg Gln Lys Lys Val
1220 1225 1230

Thr Phe Asp Arg Leu Gln Val Leu Asp Ser His Tyr Gln Asp Val Leu
1235 1240 1245

Lys Glu Val Lys Ala Ala Ala Ser Lys Val Lys Ala Asn Leu Leu Ser
1250 1255 1260

Val Glu Glu Ala Cys Ser Leu Thr Pro Pro His Ser Ala Lys Ser Lys
1265 1270 1275 1280

Phe Gly Tyr Gly Ala Lys Asp Val Arg Cys His Ala Arg Lys Ala Val
1285 1290 1295

Thr His Ile Asn Ser Val Trp Lys Asp Leu Leu Glu Asp Asn Val Thr
1300 1305 1310

Pro Ile Asp Thr Thr Ile Met Ala Lys Asn Glu Val Phe Cys Val Gln
1315 1320 1325

Pro Glu Lys Gly Gly Arg Lys Pro Ala Arg Leu Ile Val Phe Pro Asp
1330 1335 1340

Leu Gly Val Arg Val Cys Glu Lys Met Ala Leu Tyr Asp Val Val Thr
1345 1350 1355 1360

Lys Leu Pro Leu Ala Val Met Gly Ser Ser Tyr Gly Phe Gln Tyr Ser
1365 1370 1375

Pro Gly Gln Arg Val Glu Phe Leu Val Gln Ala Trp Lys Ser Lys Lys
1380 1385 1390

Thr Pro Met Gly Phe Ser Tyr Asp Thr Arg Cys Phe Asp Ser Thr Val
1395 1400 1405

Thr Glu Ser Asp Ile Arg Thr Glu Glu Ala Ile Tyr Gln Cys Cys Asp
1410 1415 1420

Leu Asp Pro Gln Ala Arg Val Ala Ile Lys Ser Leu Thr Glu Arg Leu
1425 1430 1435 1440

Tyr Val Gly Gly Pro Leu Thr Asn Ser Arg Gly Glu Asn Cys Gly Tyr
1445 1450 1455

Arg Arg Cys Arg Ala Ser Gly Val Leu Thr Thr Ser Cys Gly Asn Thr
1460 1465 1470

Leu Thr Cys Tyr Ile Lys Ala Arg Ala Ala Cys Arg Ala Ala Gly Leu
1475 1480 1485
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Gin Asp Cys Thr Met Leu Val Cys Gly Asp Asp Leu Val Val lie Cys 
1490 1495 1500 

Glu Ser Ala Gly Val Gin Glu Asp Ala Ala Ser Leu Arg Ala Phe Thr 
505 1510 1515 1520 

Glu Ala Met Thr Arg Tyr Ser Ala Pro Pro Gly Asp Pro Pro Gin Pro 
1525 1530 1535 

Glu Tyr Asp Leu Glu Leu lie Thr Ser Cys Ser Ser Asn Val Ser Val 
1540 1545 1550 

Ala His Asp Gly' Ala Gly Lys Arg Val Tyr Tyr Leu Thr Arg Asp Pro 
1555 1560 1565 

Thr Thr Pro Leu Ala Arg Ala Ala Trp Glu Thr Ala Arg His Thr Pro 
1570 1575 1580 

Val Asn Ser Trp Leu Gly Asn He He Met Phe Ala Pro Thr Leu Trp 
585 1590 1595 1600 

Ala Arg Met He Leu Met Thr His Phe Phe Ser Val Leu He Ala Arg 
1605 1610 1615 

Asp Gin Leu Glu Gin Ala Leu Asp Cys Glu He Tyr Gly Ala Cys Tyr 
1620 1625 1630 

Ser He Glu Pro Leu Asp Leu Pro Pro He He Gin Arg Leu His Gly 
1635 1640 1645 

Leu Ser Ala Phe Ser Leu His Ser Tyr Ser Pro Gly Glu He Asn Arg 
1650 1655 1660 

Val Ala Ala Cys Leu Arg Lys Leu Gly Val Pro Pro Leu Arg Ala Trp 
1665 1670 1675 1680

Arg His Arg Ala Arg Ser Val Arg Ala Arg Leu Leu Ala Arg Gly Gly
1685 1690 1695

Arg Ala Ala Ile Cys Gly Lys Tyr Leu Phe Asn Trp Ala Val Arg Thr
1700 1705 1710

Lys Leu Lys Leu Thr Pro Ile Ala Ala Ala Gly Gln Leu Asp Leu Ser
1715 1720 1725

Gly Trp Phe Thr Ala Gly Tyr Ser Gly Gly Asp Ile Tyr His Ser Val
1730 1735 1740

Ser His Ala Arg Pro Arg Trp Ile Trp Phe Cys Leu Leu Leu Leu Ala
1745 1750 1755 1760

Ala Gly Val Gly Ile Tyr Leu Leu Pro Asn Arg Met Ser Thr Asn Pro
1765 1770 1775

Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr Asn Arg Arg Pro Gln Asp
1780 1785 1790 

Val Lys Phe Pro Gly Gly Gly Gln Ile Val Gly Gly Val Tyr Leu Leu
1795 1800 1805

Pro Arg Arg Gly Pro Arg Leu Gly Val Arg Ala Thr Arg Lys Thr Ser
1810 1815 1820

Glu Arg Ser Gln Pro Arg Gly Arg Arg Gln Pro Ile Pro Lys Ala Arg
1825 1830 1835 1840

Arg Pro Glu Gly Arg Thr Trp Ala Gln Pro Gly Tyr Pro Trp Pro Leu
1845 1850 1855

Tyr Gly Asn Glu Gly Cys Gly Trp Ala Gly Trp Leu Leu Ser Pro Arg
1860 1865 1870

Gly Ser Arg Pro Ser Trp Gly Pro Thr Asp Pro Arg Arg Arg Ser Arg
1875 1880 1885

Asn Leu Gly Lys Val Ile Asp Thr Leu Thr Cys Gly Phe Ala Asp Leu
1890 1895 1900

Met Gly Tyr Ile Pro Leu Val Gly Ala Pro Leu Gly Gly Ala Ala Arg
1905 1910 1915 1920

Ala Leu Ala His Gly Val Arg Val Leu Glu Asp Gly Val Asn Tyr Ala
1925 1930 1935

Thr Gly Asn Leu Pro Gly Cys Ser
1940

atcgatccta ccccttgcgc


taaagaagta 


tatgtgccta 


ctaacgcttg tctttgtctc 


60 


tgtcactaaa cactggatta 


ttactcccag 


atacttattt 


tggactaatt taaatgattt 


120 


cggatcaacg ttcttaatat 


cgctgaatct 


tccacaattg 


atgaaagtag ctaggaagag 


180 


gaattggtat aaagtttttg 


tttttgtaaa 


tctcgaagta 


tactcaaacg aatttagtat 


240 


tttctcagtg atctcccaga 


tgctttcacc 


ctcacttaga 


agtgctttaa gcattttttt 


300 


actgtggcta tttcccttat 


ctgcttcttc 


cgatgattcg 


aactgtaatt gcaaactact 


360 


tacaatatca gtgatatcag 


attgatgttt 


ttgtccatag 


taaggaataa ttgtaaattc 


420 


ccaagcagga atcaatttct 


ttaatgaggc 


ttccagaatt 


gttgcttttt gcgtcttgta 


480 


tttaaactgg agtgatttat 


tgacaatatc 


gaaactcagc 


gaattgctta tgatagtatt 


540
atagctcatg


aatgtggctc 


tcttgattgc 


tgttccgtta 


tgtgtaatca 


tccaacataa 


600 


ataggttagt 


tcagcagcac 


ataatgctat 


tttctcacct 


gaaggtcttt 


caaacctttc 


660 


cacaaactga 


cgaacaagca 


ccttaggtgg 


tgttttacat 


aatatatcaa 


attgtggcat 


720 


gcttagcgcc 


gatcttgtgt 


gcaattgata 


tctagtttca 


actactctat 


ttatcttgta 


780 


tcttgcagta 


ttcaaacacg 


ctaactcgaa 


aaactaactt 


taattgtcct 


gtttgtctcg 


840 


cgttctttcg 


aaaaatgcac 


cggccgcgca 


ttatttgtac 


tgcgaaaata 


attggtactg 


900 


cggtatcttc 


atttcatatt 


ttaaaaatgc 


acctttgctg 


cttttcctta 


atttttagac 


960 


ggcccgcagg 


ttcgttttgc 


ggtactatct 


tgtgataaaa 


agttgttttg 


acatgtgatc 


1020 


tgcacagatt 


ttataatgta 


ataagcaaga 


atacattatc 


aaacgaacaa 


tactggtaaa 


1080 


agaaaaccaa 


aatggacgac 


attgaaacag 


ccaagaatct 


gacggtaaaa 


gcacgtacag 


1140 


cttatagcgt 


ctgggatgta 


tgtcggctgt 


ttattgaaat 


gattgctcct 


gatgtagata 


1200 


ttgatataga 


gagtaaacgt 


aagtctgatg 


agctactctt 


tccaggatat 


gtcataaggc 


1260 


ccatggaatc 


tctcacaacc 


ggtaggccgt 


atggtcttga 


ttctagcgca 


gaagattcca 


1320 


gcgtatcttc 


tgactccagt 


gctgaggtaa 


ttttgcctgc 


tgcgaagatg 


gttaaggaaa 


1380 


ggtttgattc 


gattggaaat 


ggtatgctct 


cttcacaaga 


agcaagtcag 


gctgccatag 


1440 


atttgatgct 


acagaataac 


aagctgttag 


acaatagaaa 


gcaactatac 


aaatctattg 


1500 


ctataataat 


aggaagattg 


cccgagaaag 


acaagaagag 


agctaccgaa 


atgctcatga 


1560 


gaaaaatgga 


ttgtacacag 


ttattagtcc 


caccagctcc 


aacggaagaa 


gatgttatga 


1620 


agctcgtaag 


cgtcgttacc 


caattgctta 


ctttagttcc 


accagatcgt 


caagctgctt 


1680 


taataggtga 


tttattcatc 


ccggaatctc 


taaaggatat 


attcaatagt 


ttcaatgaac 


1740 


tggcggcaga 


gaatcgttta 


cagcaaaaaa 


agagtgagtt 


ggaaggaagg 


actgaagtga 


1800 


accatgctaa 


tacaaatgaa 


gaagttccct 


ccaggcgaac 


aagaagtaga 


gacacaaatg 


1860 


caagaggagc 


atataaatta 


caaaacacca 


tcactgaggg 


ccctaaagcg 


gttcccacga 


1920 


aaaaaaggag 


agtagcaacg 


agggtaaggg 


gcagaaaatc 


acgtaatact 


tctagggtat 


1980 


gatccaatat 


caaaggaaat 


gatagcattg 


aaggatgaga 


ctaatccaat 


tgaggagtgg 


2040 


cagcatatag 


aacagctaaa 


gggtagtgct 


gaaggaagca 


tacgataccc 


cgcatggaat 


2100 


gggataatat 


cacaggaggt 


actagactac 


ctttcatcct 


acataaatag 


acgcatataa 


2160 


gtacgcattt 


aagcataaac 


acgcactatg 


ccgttcttct 


catgtatata 


tatatacagg 


2220 


caacacgcag 


atataggtgc 


gacgtgaaca 


gtgagctgta 


tgtgcgcagc 


tcgcgttgca 


2280 


ttttcggaag 


cgctcgtttt 


cggaaacgct 


ttgaagttcc 


tattccgaag 


ttcctattct 


2340
ctagaaagta


taggaacttc 


agagcgcttt 


tgaaaaccaa 


aagcgctctg 


aagacgcact 


2400 


ttcaaaaaac 


caaaaacgca 


ccggactgta 


acgagctact 


aaaatattgc 


gaataccgct 


2460 


tccacaaaca 


ttgctcaaaa 


gtatctcttt 


gctatatatc 


tctgtgctat 


atccctatat 


2520 


aacctaccca 


tccacctttc 


gctccttgaa 


cttgcatcta 


aactcgacct 


ctacatcaac 


2580 


aggcttccaa 


tgctcttcaa 


attttactgt 


caagtagacc 


catacggctg 


taatatgctg 


2640 


ctcttcataa 


tgtaagctta 


tctttatcga 


atcgtgtgaa 


aaactactac 


cgcgataaac 


2700 


ctttacggtt 


ccctgagatt 


gaattagttc 


ctttagtata 


tgatacaaga 


cacttttgaa 


2760 


ctttgtacga 


cgaattttga 


ggttcgccat 


cctctggcta 


tttccaatta 


tcctgtcggc 


2820 


tattatctcc 


gcctcagttt 


gatcttccgc 


ttcagactgc 


catttttcac 


ataatgaatc 


2880 


tatttcaccc 


cacaatcctt 


catccgcctc 


cgcatcttgt 


tccgttaaac 


tattgacttc 


2940 


atgttgtaca 


ttgtttagtt 


cacgagaagg 


gtcctcttca 


ggcggtagct 


cctgatctcc 


3000 


tatatgacct 


ttatcctgtt 


ctctttccac 


aaacttagaa 


atgtattcat 


gaattatgga 


3060 


gcacctaata 


acattcttca 


aggcggagaa 


gtttgggcca 


gatgcccaat 


atgcttgaca 


3120 


tgaaaacgtg 


agaatgaatt 


tagtattatt 


gtgatattct 


gaggcaattt 


tattataatc 


3180 


tcgaagataa 


gagaagaatg 


cagtgacctt 


tgtattgaca 


aatggagatt 


ccatgtatct 


3240 


aaaaaatacg 


cctttaggcc 


ttctgatacc 


ctttcccctg 


cggtttagcg 


tgccttttac 


3300 


attaatatct 


aaaccctctc 


cgatggtggc 


ctttaactga 


ctaataaatg 


caaccgatat 


3360 


aaactgtgat 


aattctgggt 


gatttatgat 


tcgatcgaca 


attgtattgt 


acactagtgc 


3420 


aggatcaggc 


caatccagtt 


ctttttcaat 


taccggtgtg 


tcgtctgtat 


tcagtacatg 


3480 


tccaacaaat 


gcaaatgcta 


acgttttgta 


tttcttataa 


ttgtcaggaa 


ctggaaaagt 


3540 


cccccttgtc 


gtctcgatta 


cacacctact 


ttcatcgtac 


accataggtt 


ggaagtgctg 


3600 


cataatacat 


tgcttaatac 


aagcaagcag 


tctctcgcca 


ttcatatttc 


agttattttc 


3660 


cattacagct 


gatgtcattg 


tatatcagcg 


ctgtaaaaat 


ctatctgtta 


cagaaggttt 


3720 


tcgcggtttt 


tataaacaaa 


actttcgtta 


cgaaatcgag 


caatcacccc 


agctgcgtat 


3780 


ttggaaattc 


gggaaaaagt 


agagcaacgc 


gagttgcatt 


ttttacacca 


taatgcatga 


3840 


ttaacttcga 


gaagggatta 


aggctaattt 


cactagtatg 


tttcaaaaac 


ctcaatctgt 


3900 


ccattgaatg 


ccttataaaa 


cagctataga 


ttgcatagaa 


gagttagcta 


ctcaatgctt 


3960 


tttgtcaaag 


cttactgatg 


atgatgtgtc 


tactttcagg 


cgggtctgta 


gtaaggagaa 


4020 


tgacattata 


aagctggcac 


ttagaattcc 


acggactata 


gactatacta 


gtatactccg 


4080 


tctactgtac 


gatacacttc 


cgctcaggtc 


cttgtccttt 


aacgaggcct 


taccactctt 


4140
ttgttactct


attgatccag 


ctcagcaaag 


gcagtgtgat 


ctaagattct 


atcttcgcga 


4200 


tgtagtaaaa 


ctagctagac 


cgagaaagag 


actagaaatg 


caaaaggcac 


ttctacaatg 


4260 


gctgccatca 


ttattatccg 


atgtgacgct 


gcattttttt 


tttttttttt 


tttttttttt 


4320 


tttttttttt 


tttttttttt 


ttttttggta 


caaatatcat 


aaaaaaagag 


aatcttttta 


4380 


agcaaggatt 


ttcttaactt 


cttcggcgac 


agcatcaccg 


acttcggtgg 


tactgttgga 


4440 


accacctaaa 


tcaccagttc 


tgatacctgc 


atccaaaacc 


tttttaactg 


catcttcaat 


4500 


ggctttacct 


tcttcaggca 


agttcaatga 


caatttcaac 


atcattgcag 


cagacaagat 


4560 


agtggcgata 


gggttgacct 


tattctttgg 


caaatctgga 


gcggaaccat 


ggcatggttc 


4620 


gtacaaacca 


aatgcggtgt 


tcttgtctgg 


caaagaggcc 


aaggacgcag 


atggcaacaa 


4680 


acccaaggag 


cctgggataa 


cggaggcttc 


atcggagatg 


atatcaccaa 


acatgttgct 


4740 


ggtgattata 


ataccattta 


ggtgggttgg 


gttcttaact 


aggatcatgg 


cggcagaatc 


4800


aatcaattga 


tgttgaactt 


tcaatgtagg 


gaattcgttc 


ttgatggttt 


cctccacagt 


4860


ttttctccat 


aatcttgaag 


aggccaaaac 


attagcttta 


tccaaggacc 


aaataggcaa 


4920 


tggtggctca 


tgttgtaggg 


ccatgaaagc 


ggccattctt 


gtgattcttt 


gcacttctgg 


4980 


aacggtgtat 


tgttcactat 


cccaagcgac 


accatcacca 


tcgtcttcct 


ttctcttacc 


5040 


aaagtaaata 


cctcccacta 


attctctaac 


aacaacgaag 


tcagtacctt 


tagcaaattg 


5100 


tggcttgatt 


ggagataagt 


ctaaaagaga 


gtcggatgca 


aagttacatg 


gtcttaagtt 


5160 


ggcgtacaat 


tgaagttctt 


tacggatttt 


tagtaaacct 


tgttcaggtc 


taacactacc 


5220 


ggtaccccat 


ttaggaccac 


ccacagcacc 


taacaaaacg 


gcatcagcct 


tcttggaggc 


5280 


ttccagcgcc 


tcatctggaa 


gtggaacacc 


tgtagcatcg 


atagcagcac 


caccaattaa 


5340 


atgattttcg 


aaatcgaact 


tgacattgga 


acgaacatca 


gaaatagctt 


taagaacctt 


5400 


aatggcttcg 


gctgtgattt 


cttgaccaac 


gtggtcacct 


ggcaaaacga 


cgatcttctt 


5460 


^9999Ccigcic 


attacaatgg 


tatatccttg 


aaatatatat 


aaaaaaaaaa 


aaaaaaaaaa 


5520 


aaaaaaaaaa 


atgcagcttc 


tcaatgatat 


tcgaatacgc 


tttgaggaga 


tacagcctaa 


5580 


tatccgacaa 


actgttttac 


agatttacga 


tcgtacttgt 


tacccatcat 


tgaattttga 


5640 


acatccgaac 


ctgggagttt 


tccctgaaac 


agatagtata 


tttgaacctg 


tataataata 


5700 


tatagtctag 


cgctttacgg 


aagacaatgt 


atgtatttcg 


gttcctggag 


aaactattgc 


5760 


atctattgca 


taggtaatct 


tgcacgtcgc 


atccccggtt 


cattttctgc 


gtttccatct 


5820 


tgcacttcaa 


tagcatatct 


ttgttaacga 


agcatctgtg 


cttcattttg 


tagaacaaaa 


5880 


atgcaacgcg 


agagcgctaa 


tttttcaaac 


aaagaatctg 


agctgcattt 


ttacagaaca 


5940
gaaatgcaac


gcgaaagcgc 


tattttacca 


acgaagaatc 


tgtgcttcat 


ttttgtaaaa 


6000 


caaaaatgca 


acgcgagagc 


gctaattttt 


caaacaaaga 


atctgagctg 


catttttaca 


6060 


gaacagaaat 


gcaacgcgag 


agcgctattt 


taccaacaaa 


gaatctatac 


ttcttttttg 


6120 


ttctacaaaa 


atgcatcccg 


agagcgctat 


ttttctaaca 


aagcatctta 


gattactttt 


6180 


tttctccttt 


gtgcgctcta 


taatgcagtc 


tcttgataac 


tttttgcact 


gtaggtccgt 


6240 


taaggttaga 


agaaggctac 


tttggtgtct 


attttctctt 


ccataaaaaa 


agcctgactc 


6300 


cacttcccgc 


gtttactgat 


tactagcgaa 


gctgcgggtg 


cattttttca 


agataaaggc 


6360 


atccccgatt 


atattctata 


ccgatgtgga 


ttgcgcatac 


tttgtgaaca 


gaaagtgata 


6420 


gcgttgatga 


ttcttcattg 


gtcagaaaat 


tatgaacggt 


ttcttctatt 


ttgtctctat 


6480 


atactacgta 


taggaaatgt 


ttacattttc 


gtattgtttt 


cgattcactc 


tatgaatagt 


6540 


tcttactaca 


atttttttgt 


ctaaagagta 


atactagaga 


taaacataaa 


aaatgtagag 


6600 


gtcgagttta 


gatgcaagtt 


caaggagcga 


aaggtggatg 


ggtaggttat 


atagggatat 


6660 


agcacagaga 


tatatagcaa 


agagatactt 


ttgagcaatg 


tttgtggaag 


cggtattcgc 


6720 


aatattttag 


tagctcgtta 


cagtccggtg 


cgtttttggt 


tttttgaaag 


tgcgtcttca 


6780 


gagcgctttt 


ggttttcaaa 


agcgctctga 


agttcctata 


ctttctagag 


aataggaact 


6840 


tcggaatagg 


aacttcaaag 


cgtttccgaa 


aacgagcgct 


tccgaaaatg 


caacgcgagc 


6900 


tgcgcacata 


cagctcactg 


ttcacgtcgc 


acctatatct 


gcgtgttgcc 


tgtatatata 


6960 


tatacatgag 


aagaacggca 


tagtgcgtgt 


ttatgcttaa 


atgcgtactt 


atatgcgtct 


7020 


atttatgtag 


gatgaaaggt 


agtctagtac 


ctcctgtgat 


attatcccat 


tccatgcggg 


7080 


gtatcgtatg 


cttccttcag 


cactaccctt 


tagctgttct 


atatgctgcc 


actcctcaat 


7140 


tggattagtc 


tcatccttca 


atgctatcat 


ttcctttgat 


attggatcat 


atgcatagta 


7200 


ccgagaaact 


agtgcgaagt 


agtgatcagg 


tattgctgtt 


atctgatgag 


tatacgttgt 


7260 


cctggccacg 


gcagaagcac 


gcttatcgct 


ccaatttccc 


acaacattag 


tcaactccgt 


7320 


taggcccttc 


attgaaagaa 


atgaggtcat 


caaatgtctt 


ccaatgtgag 


attttgggcc 


7380 


attttttata 


gcaaagattg 


aataaggcgc 


atttttcttc 


aaagctttat 


tgtacgatct 


7440 


gactaagtta 


tcttttaata 


attggtattc 


ctgtttattg 


cttgaagaat 


tgccggtcct 


7500 


atttactcgt 


tttaggactg 


gttcagaatt 


cctcaaaaat 


tcatccaaat 


atacaagtgg 


7560 


atcgatgata 


agctgtcaaa 


catgagaatt 


cttgaagacg 


aaagggcctc 


gtgatacgcc 


7620 


tatttttata 


ggttaatgtc 


atgataataa 


tggtttctta 


gacgtcaggt 


ggcacttttc 


7680 


ggggaaatgt 


gcgcggaacc 


cctatttgtt 


tatttttcta 


aatacattca 


aatatgtatc 


7740
cgctcatgag


acaataaccc 


tgataaatgc 


ttcaataata 


ttgaaaaagg 


aagagtatga 


7800 


gtattcaaca 


tttccgtgtc 


gcccttattc 


ccttttttgc 


ggcattttgc 


cttcctgttt 


7860 


ttgctcaccc 


agaaacgctg 


gtgaaagtaa 


aagatgctga 


agatcagttg 


ggtgcacgag 


7920 


tgggttacat 


cgaactggat 


ctcaacagcg 


gtaagatcct 


tgagagtttt 


cgccccgaag 


7980 


aacgttttcc 


aatgatgagc 


acttttaaag 


ttctgctatg 


tggcgcggta 


ttatcccgtg 


8040 


ttgacgccgg 


gcaagagcaa 


ctcggtcgcc 


gcatacacta 


ttctcagaat 


gacttggttg 


8100 


agtactcacc 


agtcacagaa 


aagcatctta 


cggatggcat 


gacagtaaga 


gaattatgca 


8160 


gtgctgccat 


aaccatgagt 


gataacactg 


cggccaactt 


acttctgaca 


acgatcggag 


8220 


gaccgaagga 


gctaaccgct 


tttttgcaca 


acatggggga 


tcatgtaact 


cgccttgatc 


8280 


gttgggaacc 


ggagctgaat 


gaagccatac 


caaacgacga 


gcgtgacacc 


acgatgcctg 


8340 


cagcaatggc 


aacaacgttg 


cgcaaactat 


taactggcga 


actacttact 


ctagcttccc 


8400 


ggcaacaatt 


aatagactgg 


atggaggcgg 


ataaagttgc 


aggaccactt 


ctgcgctcgg 


8460 


cccttccggc 


tggctggttt 


attgctgata 


aatctggagc 


cggtgagcgt 


gggtctcgcg 


8520 


gtatcattgc 


agcactgggg 


ccagatggta 


agccctcccg 


tatcgtagtt 


atctacacga 


8580 


cggggagtca


ggcaactatg 


gatgaacgaa 


atagacagat 


cgctgagata 


ggtgcctcac 


8640 


tgattaagca 


ttggtaactg 


tcagaccaag 


tttactcata 


tatactttag 


attgatttaa 


8700 


aacttcattt 


ttaatttaaa 


aggatctagg 


tgaagatcct 


ttttgataat 


ctcatgacca 


8760 


aaatccctta 


acgtgagttt 


tcgttccact 


gagcgtcaga 


ccccgtagaa 


aagatcaaag 


8820 


gatcttcttg 


agatcctttt 


tttctgcgcg 


taatctgctg 


cttgcaaaca 


aaaaaaccac 


8880 


cgctaccagc 


ggtggtttgt 


ttgccggatc 


aagagctacc 


aactcttttt 


ccgaaggtaa 


8940 


ctggcttcag 


cagagcgcag 


ataccaaata 


ctgtccttct 


agtgtagccg 


tagttaggcc 


9000 


accacttcaa 


gaactctgta 


gcaccgccta 


catacctcgc 


tctgctaatc 


ctgttaccag 


9060 


tggctgctgc 


cagtggcgat 


aagtcgtgtc 


ttaccgggtt 


ggactcaaga 


cgatagttac 


9120 


cggataaggc 


gcagcggtcg 


ggctgaacgg 


ggggttcgtg 


cacacagccc 


agcttggagc 


9180 


gaacgaccta 


caccgaactg 


agatacctac 


agcgtgagct 


atgagaaagc 


gccacgcttc 


9240 


ccgaagggag 


aaaggcggac 


aggtatccgg 


taagcggcag 


ggtcggaaca 


ggagagcgca 


9300 


cgagggagct 


tccaggggga 


aacgcctggt 


acctttatag 


tcctgtcggg 


tttcgccacc 


9360 


tctgacttga 


gcgtcgattt 


ttgtgatgct 


cgtcaggggg 


gcggagccta 


tggaaaaacg 


9420 


ccagcaacgc 


ggccttttta 


cggttcctgg 


ccttttgctg 


gccttttgct 


cacatgttct 


9480 


ttcctgcgtt 


atcccctgat 


tctgtggata 


accgtattac 


cgcctttgag 


tgagctgata 


9540
ccgctcgccg


cagccgaacg 


accgagcgca 


gcgagtcagt 


gagcgaggaa 


gcggaagagc 


9600 


gcctgatgcg 


gtattttctc 


cttacgcatc 


tgtgcggtat 


ttcacaccgc 


atatggtgca 


9660 


ctctcagtac 


aatctgctct 


gatgccgcat 


agttaagcca 


gtatacactc 


cgctatcgct 


9720 


acgtgactgg 


gtcatggctg 


cgccccgaca 


cccgccaaca 


cccgctgacg 


cgccctgacg 


9780 


ggcttgtctg 


ctcccggcat 


ccgcttacag 


acaagctgtg 


accgtctccg 


ggagctgcat 


9840 


gtgtcagagg 


ttttcaccgt 


catcaccgaa 


acgcgcgagg 


cagctgcggt 


aaagctcatc 


9900 


agcgtggtcg 


tgaagcgatt 


cacagatgtc 


tgcctgttca 


tccgcgtcca 


gctcgttgag 


9960 


tttctccaga 


agcgttaatg 


tctggcttct 


gataaagcgg 


gccatgttaa 


gggcggtttt 


10020 


ttcctgtttg 


gtcactgatg 


cctccgtgta 


agggggattt 


ctgttcatgg 


gggtaatgat 


10080 


accgatgaaa 


cgagagagga 


tgctcacgat 


acgggttact 


gatgatgaac 


atgcccggtt 


10140 


actggaacgt 


tgtgagggta 


aacaactggc 


ggtatggatg 


cggcgggacc 


agagaaaaat 


10200 


cactcagggt 


caatgccagc 


gcttcgttaa 


tacagatgta 


ggtgttccac 


agggtagcca 


10260 


gcagcatcct 


gcgatgcaga 


tccggaacat 


aatggtgcag 


ggcgctgact 


tccgcgtttc 


10320 


cagactttac 


gaaacacgga 


aaccgaagac 


cattcatgtt 


gttgctcagg 


tcgcagacgt 


10380 


tttgcagcag 


cagtcgcttc 


acgttcgctc 


gcgtatcggt 


gattcattct 


gctaaccagt 


10440 


aaggcaaccc 


cgccagccta 


gccgggtcct 


caacgacagg 


agcacgatca 


tgcgcacccg 


10500 


tggccaggac 


ccaacgctgc 


ccgagatgcg 


ccgcgtgcgg 


ctgctggaga 


tggcggacgc 


10560 


gatggatatg 


ttctgccaag 


ggttggtttg 


cgcattcaca 


gttctccgca 


agaattgatt 


10620 


ggctccaatt 


cttggagtgg 


tgaatccgtt 


agcgaggtgc 


cgccggcttc 


cattcaggtc 


10680 


gaggtggccc 


ggctccatgc 


accgcgacgc 


aacgcgggga 


ggcagacaag 


gtatagggcg 


10740 


gcgcctacaa 


tccatgccaa 


cccgttccat 


gtgctcgccg 


aggcggcata 


aatcgccgtg 


10800 


acgatcagcg 


gtccaatgat 


cgaagttagg 


ctggtaagag 


ccgcgagcga 


tccttgaagc 


10860 


tgtccctgat 


ggtcgtcatc 


tacctgcctg 


gacagcatgg 


cctgcaacgc 


gggcatcccg 


10920 


atgccgccgg 


aagcgagaag 


aatcataatg 


gggaaggcca 


tccagcctcg 


cgtcgcgaac 


10980 


gccagcaaga 


cgtagcccag 


cgcgtcggcc 


gccatgccgg 


cgataatggc 


ctgcttctcg 


11040 


ccgaaacgtt 


tggtggcggg 


accagtgacg 


aaggcttgag 


cgagggcgtg 


caagattccg 


11100 


aataccgcaa 


gcgacaggcc 


gatcatcgtc 


gcgctccagc 


gaaagcggtc 


ctcgccgaaa 


11160 


atgacccaga 


gcgctgccgg 


cacctgtcct 


acgagttgca 


tgataaagaa 


gacagtcata 


11220 


agtgcggcga 


cgatagtcat 


gccccgcgcc 


caccggaagg 


agctgactgg 


gttgaaggct 


11280 


ctcaagggca 


tcggtcgagg 


atccttcaat 


atgcgcacat 


acgctgttat 


gttcaaggtc 


11340
ccttcgttta


agaacgaaag 


cggtcttcct 


tttgagggat 


gtttcaagtt 


gttcaaatct 


11400 


atcaaatttg 


caaatcccca 


gtctgtatct 


^gagcgttga 


atcggtgatg 


cgatttgtta 


11460 


attaaattga 


tggtgtcacc 


attaccaggt 


ctagatatac 


caatggcaaa 


ctgagcacaa 


11520 


caataccagt 


ccggatcaac 


tggcaccatc 


tctcccgtag 


tctcatctaa 


tttttcttcc 


11580 


ggatgaggtt 


ccagatatac 


cgcaacacct 


ttattatggt 


ttccctgagg 


gaataataga 


11640 


atgtcccatt 


cgaaatcacc 


aattctaaac 


ctgggcgaat 


tgtatttcgg 


gtttgttaac 


11700 


tcgttccagt 


caggaatgtt 


ccacgtgaag 


ctatcttcca 


gcaaagtctc 


cacttcttca 


11760 


tcaaattgtg 


gagaatactc 


ccaatgctct 


tatctatggg 


acttccggga 


aacacagtac 


11820 


cgatacttcc 


caattcgtct 


tcagagctca 


ttgtttgttt 


gaagagacta 


atcaaagaat 


11880 


cgttttctca 


aaaaaattaa 


tatcttaact 


gatagtttga 


tcaaaggggc 


aaaacgtagg 


11940 


ggcaaacaaa 


cggaaaaatc 


gtttctcaaa 


ttttctgatg 


ccaagaactc 


taaccagtct 


12000 


tatctaaaaa 


ttgccttatg 


atccgtctct 


ccggttacag 


cctgtgtaac 


tgattaatcc 


12060 


tgcctttcta 


atcaccattc 


taatgtttta 


attaagggat 


tttgtcttca 


ttaacggctt 


12120 


tcgctcataa 


aaatgttatg 


acgttttgcc 


cgcaggcggg 


aaaccatcca 


cttcacgaga 


12180 


ctgatctcct 


ctgccggaac 


accgggcatc 


tccaacttat 


aagttggaga 


aataagagaa 


12240 


tttcagattg 


agagaatgaa 


aaaaaaaaac 


ccttagttca 


taggtccatt 


ctcttagcgc 


12300 


aactacagag 


aacaggggca 


caaacaggca 


aaaaacgggc 


acaacctcaa 


tggagtgatg 


12360 


caacctgcct 


ggagtaaatg 


atgacacaag 


gcaattgacc 


cacgcatgta 


tctatctcat 


12420 


tttcttacac 


cttctattac 


cttctgctct 


ctctgatttg 


gaaaaagctg 


aaaaaaaagg 


12480 


ttgaaaccag 


ttccctgaaa 


ttattcccct 


acttgactaa 


taagtatata 


aagacggtag 


12540 


gtattgattg 


taattctgta 


aatctatttc 


ttaaacttct 


taaattctac 


ttttatagtt 


12600 


agtctttttt 


ttagttttaa 


aacaccaaga 


acttagtttc 


gaataaacac 


acataaacaa 


12660 


acaagcttac 


aaaacaaa atg gct gca tat gca gct cag ggc tat aag gtg 12711
Met Ala Ala Tyr Ala Ala Gln Gly Tyr Lys Val
1 5 10

cta gta ctc aac ccc tct gtt gct gca aca ctg ggc ttt ggt gct tac 12759
Leu Val Leu Asn Pro Ser Val Ala Ala Thr Leu Gly Phe Gly Ala Tyr
15 20 25

atg tcc aag gct cat ggg atc gat cct aac atc agg acc ggg gtg aga 12807
Met Ser Lys Ala His Gly Ile Asp Pro Asn Ile Arg Thr Gly Val Arg
30 35 40

aca att acc act ggc agc ccc atc acg tac tcc acc tac ggc aag ttc 12855
Thr Ile Thr Thr Gly Ser Pro Ile Thr Tyr Ser Thr Tyr Gly Lys Phe
45 50 55

ctt gcc gac ggc ggg tgc tcg ggg ggc gct tat gac ata ata att tgt 12903
Leu Ala Asp Gly Gly Cys Ser Gly Gly Ala Tyr Asp Ile Ile Ile Cys
60 65 70 75

gac gag tgc cac tcc acg gat gcc aca tcc atc ttg ggc att ggc act 12951
Asp Glu Cys His Ser Thr Asp Ala Thr Ser Ile Leu Gly Ile Gly Thr
80 85 90

gtc ctt gac caa gca gag act gcg ggg gcg aga ctg gtt gtg ctc gcc 12999
Val Leu Asp Gln Ala Glu Thr Ala Gly Ala Arg Leu Val Val Leu Ala
95 100 105

acc gcc acc cct ccg ggc tcc gtc act gtg ccc cat ccc aac atc gag 13047
Thr Ala Thr Pro Pro Gly Ser Val Thr Val Pro His Pro Asn Ile Glu
110 115 120

gag gtt gct ctg tcc acc acc gga gag atc cct ttt tac ggc aag gct 13095
Glu Val Ala Leu Ser Thr Thr Gly Glu Ile Pro Phe Tyr Gly Lys Ala
125 130 135

atc ccc ctc gaa gta atc aag ggg ggg aga cat ctc atc ttc tgt cat 13143
Ile Pro Leu Glu Val Ile Lys Gly Gly Arg His Leu Ile Phe Cys His
140 145 150 155 

tca aag aag aag tgc gac gaa ctc gcc gca aag ctg gtc gca ttg ggc 13191
Ser Lys Lys Lys Cys Asp Glu Leu Ala Ala Lys Leu Val Ala Leu Gly
160 165 170

atc aat gcc gtg gcc tac tac cgc ggt ctt gac gtg tcc gtc atc ccg 13239
Ile Asn Ala Val Ala Tyr Tyr Arg Gly Leu Asp Val Ser Val Ile Pro
175 180 185

acc agc ggc gat gtt gtc gtc gtg gca acc gat gcc ctc atg acc ggc 13287
Thr Ser Gly Asp Val Val Val Val Ala Thr Asp Ala Leu Met Thr Gly
190 195 200

tat acc ggc gac ttc gac tcg gtg ata gac tgc aat acg tgt gtc acc 13335
Tyr Thr Gly Asp Phe Asp Ser Val Ile Asp Cys Asn Thr Cys Val Thr
205 210 215

cag aca gtc gat ttc agc ctt gac cct acc ttc acc att gag aca atc 13383
Gln Thr Val Asp Phe Ser Leu Asp Pro Thr Phe Thr Ile Glu Thr Ile
220 225 230 235

acg ctc ccc caa gat gct gtc tcc cgc act caa cgt cgg ggc agg act 13431
Thr Leu Pro Gln Asp Ala Val Ser Arg Thr Gln Arg Arg Gly Arg Thr
240 245 250

ggc agg ggg aag cca ggc atc tac aga ttt gtg gca ccg ggg gag cgc 13479
Gly Arg Gly Lys Pro Gly Ile Tyr Arg Phe Val Ala Pro Gly Glu Arg
255 260 265

ccc tcc ggc atg ttc gac tcg tcc gtc ctc tgt gag tgc tat gac gca 13527
Pro Ser Gly Met Phe Asp Ser Ser Val Leu Cys Glu Cys Tyr Asp Ala
270 275 280

ggc tgt gct tgg tat gag ctc acg ccc gcc gag act aca gtt agg cta 13575
Gly Cys Ala Trp Tyr Glu Leu Thr Pro Ala Glu Thr Thr Val Arg Leu
285 290 295

cga gcg tac atg aac acc ccg ggg ctt ccc gtg tgc cag gac cat ctt 13623
Arg Ala Tyr Met Asn Thr Pro Gly Leu Pro Val Cys Gln Asp His Leu
300 305 310 315 

gaa ttt tgg gag ggc gtc ttt aca ggc ctc act cat ata gat gcc cac 13671
Glu Phe Trp Glu Gly Val Phe Thr Gly Leu Thr His Ile Asp Ala His
320 325 330

ttt cta tcc cag aca aag cag agt ggg gag aac ctt cct tac ctg gta 13719
Phe Leu Ser Gln Thr Lys Gln Ser Gly Glu Asn Leu Pro Tyr Leu Val
335 340 345

gcg tac caa gcc acc gtg tgc gct agg gct caa gcc cct ccc cca tcg 13767
Ala Tyr Gln Ala Thr Val Cys Ala Arg Ala Gln Ala Pro Pro Pro Ser
350 355 360

tgg gac cag atg tgg aag tgt ttg att cgc ctc aag ccc acc ctc cat 13815
Trp Asp Gln Met Trp Lys Cys Leu Ile Arg Leu Lys Pro Thr Leu His
365 370 375

ggg cca aca ccc ctg cta tac aga ctg ggc gct gtt cag aat gaa atc 13863
Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Ala Val Gln Asn Glu Ile
380 385 390 395

acc ctg acg cac cca gtc acc aaa tac atc atg aca tgc atg tcg gcc 13911
Thr Leu Thr His Pro Val Thr Lys Tyr Ile Met Thr Cys Met Ser Ala
400 405 410

gac ctg gag gtc gtc acg agc acc tgg gtg ctc gtt ggc ggc gtc ctg 13959
Asp Leu Glu Val Val Thr Ser Thr Trp Val Leu Val Gly Gly Val Leu
415 420 425

gct gct ttg gcc gcg tat tgc ctg tca aca ggc tgc gtg gtc ata gtg 14007
Ala Ala Leu Ala Ala Tyr Cys Leu Ser Thr Gly Cys Val Val Ile Val
430 435 440 

ggc agg gtc gtc ttg tcc ggg aag ccg gca atc ata cct gac agg gaa 14055
Gly Arg Val Val Leu Ser Gly Lys Pro Ala Ile Ile Pro Asp Arg Glu
445 450 455

gtc ctc tac cga gag ttc gat gag atg gaa gag tgc tct cag cac tta 14103
Val Leu Tyr Arg Glu Phe Asp Glu Met Glu Glu Cys Ser Gln His Leu
460 465 470 475

ccg tac atc gag caa ggg atg atg ctc gcc gag cag ttc aag cag aag 14151
Pro Tyr Ile Glu Gln Gly Met Met Leu Ala Glu Gln Phe Lys Gln Lys
480 485 490

gcc ctc ggc ctc ctg cag acc gcg tcc cgt cag gca gag gtt atc gcc 14199
Ala Leu Gly Leu Leu Gln Thr Ala Ser Arg Gln Ala Glu Val Ile Ala
495 500 505

cct gct gtc cag acc aac tgg caa aaa ctc gag acc ttc tgg gcg aag 14247
Pro Ala Val Gln Thr Asn Trp Gln Lys Leu Glu Thr Phe Trp Ala Lys
510 515 520

cat atg tgg aac ttc atc agt ggg ata caa tac ttg gcg ggc ttg tca 14295
His Met Trp Asn Phe Ile Ser Gly Ile Gln Tyr Leu Ala Gly Leu Ser
525 530 535

acg ctg cct ggt aac ccc gcc att gct tca ttg atg gct ttt aca gct 14343
Thr Leu Pro Gly Asn Pro Ala Ile Ala Ser Leu Met Ala Phe Thr Ala
540 545 550 555 

gct gtc acc agc cca cta acc act agc caa acc ctc ctc ttc aac ata 14391
Ala Val Thr Ser Pro Leu Thr Thr Ser Gln Thr Leu Leu Phe Asn Ile
560 565 570

ttg ggg ggg tgg gtg gct gcc cag ctc gcc gcc ccc ggt gcc gct act 14439
Leu Gly Gly Trp Val Ala Ala Gln Leu Ala Ala Pro Gly Ala Ala Thr
575 580 585

gcc ttt gtg ggc gct ggc tta gct ggc gcc gcc atc ggc agt gtt gga 14487
Ala Phe Val Gly Ala Gly Leu Ala Gly Ala Ala Ile Gly Ser Val Gly
590 595 600

ctg ggg aag gtc ctc ata gac atc ctt gca ggg tat ggc gcg ggc gtg 14535
Leu Gly Lys Val Leu Ile Asp Ile Leu Ala Gly Tyr Gly Ala Gly Val
605 610 615

gcg gga gct ctt gtg gca ttc aag atc atg agc ggt gag gtc ccc tcc 14583
Ala Gly Ala Leu Val Ala Phe Lys Ile Met Ser Gly Glu Val Pro Ser
620 625 630 635

acg gag gac ctg gtc aat cta ctg ccc gcc atc ctc tcg ccc gga gcc 14631
Thr Glu Asp Leu Val Asn Leu Leu Pro Ala Ile Leu Ser Pro Gly Ala
640 645 650

ctc gta gtc ggc gtg gtc tgt gca gca ata ctg cgc cgg cac gtt ggc 14679
Leu Val Val Gly Val Val Cys Ala Ala Ile Leu Arg Arg His Val Gly
655 660 665

ccg ggc gag ggg gca gtg cag tgg atg aac cgg ctg ata gcc ttc gcc 14727
Pro Gly Glu Gly Ala Val Gln Trp Met Asn Arg Leu Ile Ala Phe Ala
670 675 680 

tcc cgg ggg aac cat gtt tcc ccc acg cac tac gtg ccg gag agc gat 14775
Ser Arg Gly Asn His Val Ser Pro Thr His Tyr Val Pro Glu Ser Asp
685 690 695

gca gct gcc cgc gtc act gcc ata ctc agc agc ctc act gta acc cag 14823
Ala Ala Ala Arg Val Thr Ala Ile Leu Ser Ser Leu Thr Val Thr Gln
700 705 710 715

ctc ctg agg cga ctg cac cag tgg ata agc tcg gag tgt acc act cca 14871
Leu Leu Arg Arg Leu His Gln Trp Ile Ser Ser Glu Cys Thr Thr Pro
720 725 730

tgc tcc ggt tcc tgg cta agg gac atc tgg gac tgg ata tgc gag gtg 14919
Cys Ser Gly Ser Trp Leu Arg Asp Ile Trp Asp Trp Ile Cys Glu Val
735 740 745

ttg agc gac ttt aag acc tgg cta aaa gct aag ctc atg cca cag ctg 14967
Leu Ser Asp Phe Lys Thr Trp Leu Lys Ala Lys Leu Met Pro Gln Leu
750 755 760

cct ggg atc ccc ttt gtg tcc tgc cag cgc ggg tat aag ggg gtc tgg 15015
Pro Gly Ile Pro Phe Val Ser Cys Gln Arg Gly Tyr Lys Gly Val Trp
765 770 775

cga ggg gac ggc atc atg cac act cgc tgc cac tgt gga gct gag atc 15063
Arg Gly Asp Gly Ile Met His Thr Arg Cys His Cys Gly Ala Glu Ile
780 785 790 795 

act gga cat gtc aaa aac ggg acg atg agg atc gtc ggt cct agg acc 15111
Thr Gly His Val Lys Asn Gly Thr Met Arg Ile Val Gly Pro Arg Thr
800 805 810

tgc agg aac atg tgg agt ggg acc ttc ccc att aat gcc tac acc acg 15159
Cys Arg Asn Met Trp Ser Gly Thr Phe Pro Ile Asn Ala Tyr Thr Thr
815 820 825

ggc ccc tgt acc ccc ctt cct gcg ccg aac tac acg ttc gcg cta tgg 15207
Gly Pro Cys Thr Pro Leu Pro Ala Pro Asn Tyr Thr Phe Ala Leu Trp
830 835 840

agg gtg tct gca gag gaa tac gtg gag ata agg cag gtg ggg gac ttc 15255
Arg Val Ser Ala Glu Glu Tyr Val Glu Ile Arg Gln Val Gly Asp Phe
845 850 855

cac tac gtg acg ggt atg act act gac aat ctt aaa tgc ccg tgc cag 15303
His Tyr Val Thr Gly Met Thr Thr Asp Asn Leu Lys Cys Pro Cys Gln
860 865 870 875

gtc cca tcg ccc gaa ttt ttc aca gaa ttg gac ggg gtg cgc cta cat 15351
Val Pro Ser Pro Glu Phe Phe Thr Glu Leu Asp Gly Val Arg Leu His
880 885 890

agg ttt gcg ccc ccc tgc aag ccc ttg ctg cgg gag gag gta tca ttc 15399
Arg Phe Ala Pro Pro Cys Lys Pro Leu Leu Arg Glu Glu Val Ser Phe
895 900 905

aga gta gga ctc cac gaa tac ccg gta ggg tcg caa tta cct tgc gag 15447
Arg Val Gly Leu His Glu Tyr Pro Val Gly Ser Gln Leu Pro Cys Glu
910 915 920 

ccc gaa ccg gac gtg gcc gtg ttg acg tcc atg ctc act gat ccc tcc 15495
Pro Glu Pro Asp Val Ala Val Leu Thr Ser Met Leu Thr Asp Pro Ser
925 930 935

cat ata aca gca gag gcg gcc ggg cga agg ttg gcg agg gga tca ccc 15543
His Ile Thr Ala Glu Ala Ala Gly Arg Arg Leu Ala Arg Gly Ser Pro
940 945 950 955

ccc tct gtg gcc agc tcc tcg gct agc cag cta tcc gct cca tct ctc 15591
Pro Ser Val Ala Ser Ser Ser Ala Ser Gln Leu Ser Ala Pro Ser Leu
960 965 970

aag gca act tgc acc gct aac cat gac tcc cct gat gct gag ctc ata 15639
Lys Ala Thr Cys Thr Ala Asn His Asp Ser Pro Asp Ala Glu Leu Ile
975 980 985

gag gcc aac ctc cta tgg agg cag gag atg ggc ggc aac atc acc agg 15687
Glu Ala Asn Leu Leu Trp Arg Gln Glu Met Gly Gly Asn Ile Thr Arg
990 995 1000

gtt gag tca gaa aac aaa gtg gtg att ctg gac tcc ttc gat ccg ctt 15735
Val Glu Ser Glu Asn Lys Val Val Ile Leu Asp Ser Phe Asp Pro Leu
1005 1010 1015

gtg gcg gag gag gac gag cgg gag atc tcc gta ccc gca gaa atc ctg 15783
Val Ala Glu Glu Asp Glu Arg Glu Ile Ser Val Pro Ala Glu Ile Leu
1020 1025 1030 1035

cgg aag tct cgg aga ttc gcc cag gcc ctg ccc gtt tgg gcg cgg ccg 15831
Arg Lys Ser Arg Arg Phe Ala Gln Ala Leu Pro Val Trp Ala Arg Pro
1040 1045 1050

gac tat aac ccc ccg cta gtg gag acg tgg aaa aag ccc gac tac gaa 15879
Asp Tyr Asn Pro Pro Leu Val Glu Thr Trp Lys Lys Pro Asp Tyr Glu
1055 1060 1065

cca cct gtg gtc cat ggc tgc ccg ctt cca cct cca aag tcc cct cct 15927
Pro Pro Val Val His Gly Cys Pro Leu Pro Pro Pro Lys Ser Pro Pro
1070 1075 1080

gtg cct ccg cct cgg aag aag cgg acg gtg gtc ctc act gaa tca acc 15975
Val Pro Pro Pro Arg Lys Lys Arg Thr Val Val Leu Thr Glu Ser Thr
1085 1090 1095

cta tct act gcc ttg gcc gag ctc gcc acc aga agc ttt ggc agc tcc 16023
Leu Ser Thr Ala Leu Ala Glu Leu Ala Thr Arg Ser Phe Gly Ser Ser
1100 1105 1110 1115

tca act tcc ggc att acg ggc gac aat acg aca aca tcc tct gag ccc 16071
Ser Thr Ser Gly Ile Thr Gly Asp Asn Thr Thr Thr Ser Ser Glu Pro
1120 1125 1130

gcc cct tct ggc tgc ccc ccc gac tcc gac gct gag tcc tat tcc tcc 16119
Ala Pro Ser Gly Cys Pro Pro Asp Ser Asp Ala Glu Ser Tyr Ser Ser
1135 1140 1145

atg ccc ccc ctg gag ggg gag cct ggg gat ccg gat ctt agc gac ggg 16167
Met Pro Pro Leu Glu Gly Glu Pro Gly Asp Pro Asp Leu Ser Asp Gly
1150 1155 1160

tca tgg tca acg gtc agt agt gag gcc aac gcg gag gat gtc gtg tgc 16215
Ser Trp Ser Thr Val Ser Ser Glu Ala Asn Ala Glu Asp Val Val Cys
1165 1170 1175

tgc tca atg tct tac tct tgg aca ggc gca ctc gtc acc ccg tgc gcc 16263
Cys Ser Met Ser Tyr Ser Trp Thr Gly Ala Leu Val Thr Pro Cys Ala
1180 1185 1190 1195

gcg gaa gaa eag aaa ctg cee ate aat gca eta age aae teg ttg eta 
Ala Glu Glu Gin Lys Leu Pro lie Asn Ala Leu Ser Asn Ser Leu Leu 
1200 1205 1210 



16311 



cgt cac cac aat ttg gtg tat tec acc acc tea cgc agt get tgc caa 
Arg His His Asn Leu Val Tyr Ser Thr Thr Ser Arg Ser Ala Cys Gin 
1215 1220 1225 



16359 



agg eag aag aaa gtc aca ttt gac aga ctg caa gtt ctg gac age cat 
Arg Gin Lys Lys Val Thr Phe Asp Arg Leu Gin Val Leu Asp Ser His 
1230 1235 1240 



16407 



tae eag gac gta etc aag gag gtt aaa gca gcg gcg tea aaa gtg aag 
Tyr Gin Asp Val Leu Lys Glu Val Lys Ala Ala Ala Ser Lys Val Lys 
1245 1250 1255 



16455 
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get aac ttg eta tee gta gag gaa get tge age ctg acg cee cca eac 16503 
Ala Ash Leu Leu Ser Val Glu Glu Ala Cys Ser Leu Thr Pro Pro His 
1260 1265 1270 1275 

tca gcc aaa tcc aag ttt ggt tat ggg gca aaa gac gtc cgt tgc cat 16551
Ser Ala Lys Ser Lys Phe Gly Tyr Gly Ala Lys Asp Val Arg Cys His
1280 1285 1290

gcc aga aag gcc gta acc cac atc aac tcc gtg tgg aaa gac ctt ctg 16599
Ala Arg Lys Ala Val Thr His Ile Asn Ser Val Trp Lys Asp Leu Leu
1295 1300 1305

gaa gac aat gta aca cca ata gac act acc atc atg gct aag aac gag 16647
Glu Asp Asn Val Thr Pro Ile Asp Thr Thr Ile Met Ala Lys Asn Glu
1310 1315 1320

gtt ttc tgc gtt cag cct gag aag ggg ggt cgt aag cca gct cgt ctc 16695
Val Phe Cys Val Gln Pro Glu Lys Gly Gly Arg Lys Pro Ala Arg Leu
1325 1330 1335

atc gtg ttc ccc gat ctg ggc gtg cgc gtg tgc gaa aag atg gct ttg 16743
Ile Val Phe Pro Asp Leu Gly Val Arg Val Cys Glu Lys Met Ala Leu
1340 1345 1350 1355

tac gac gtg gtt aca aag ctc ccc ttg gcc gtg atg gga agc tcc tac 16791
Tyr Asp Val Val Thr Lys Leu Pro Leu Ala Val Met Gly Ser Ser Tyr
1360 1365 1370

gga ttc caa tac tca cca gga cag cgg gtt gaa ttc ctc gtg caa gcg 16839
Gly Phe Gln Tyr Ser Pro Gly Gln Arg Val Glu Phe Leu Val Gln Ala
1375 1380 1385

tgg aag tcc aag aaa acc cca atg ggg ttc tcg tat gat acc cgc tgc 16887
Trp Lys Ser Lys Lys Thr Pro Met Gly Phe Ser Tyr Asp Thr Arg Cys
1390 1395 1400

ttt gac tcc aca gtc act gag agc gac atc cgt acg gag gag gca atc 16935
Phe Asp Ser Thr Val Thr Glu Ser Asp Ile Arg Thr Glu Glu Ala Ile
1405 1410 1415

tac caa tgt tgt gac ctc gac ccc caa gcc cgc gtg gcc atc aag tcc 16983
Tyr Gln Cys Cys Asp Leu Asp Pro Gln Ala Arg Val Ala Ile Lys Ser
1420 1425 1430 1435

ctc acc gag agg ctt tat gtt ggg ggc cct ctt acc aat tca agg ggg 17031
Leu Thr Glu Arg Leu Tyr Val Gly Gly Pro Leu Thr Asn Ser Arg Gly
1440 1445 1450

gag aac tgc ggc tat cgc agg tgc cgc gcg agc ggc gta ctg aca act 17079
Glu Asn Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Leu Thr Thr
1455 1460 1465

agc tgt ggt aac acc ctc act tgc tac atc aag gcc cgg gca gcc tgt 17127
Ser Cys Gly Asn Thr Leu Thr Cys Tyr Ile Lys Ala Arg Ala Ala Cys
1470 1475 1480

cga gcc gca ggg ctc cag gac tgc acc atg ctc gtg tgt ggc gac gac 17175
Arg Ala Ala Gly Leu Gln Asp Cys Thr Met Leu Val Cys Gly Asp Asp
1485 1490 1495

tta gtc gtt atc tgt gaa agc gcg ggg gtc cag gag gac gcg gcg agc 17223
Leu Val Val Ile Cys Glu Ser Ala Gly Val Gln Glu Asp Ala Ala Ser
1500 1505 1510 1515

ctg aga gcc ttc acg gag gct atg acc agg tac tcc gcc ccc cct ggg 17271
Leu Arg Ala Phe Thr Glu Ala Met Thr Arg Tyr Ser Ala Pro Pro Gly
1520 1525 1530

gac ccc cca caa cca gaa tac gac ttg gag ctc ata aca tca tgc tcc 17319
Asp Pro Pro Gln Pro Glu Tyr Asp Leu Glu Leu Ile Thr Ser Cys Ser
1535 1540 1545

tcc aac gtg tca gtc gcc cac gac ggc gct gga aag agg gtc tac tac 17367
Ser Asn Val Ser Val Ala His Asp Gly Ala Gly Lys Arg Val Tyr Tyr
1550 1555 1560

ctc acc cgt gac cct aca acc ccc ctc gcg aga gct gcg tgg gag aca 17415
Leu Thr Arg Asp Pro Thr Thr Pro Leu Ala Arg Ala Ala Trp Glu Thr
1565 1570 1575

gca aga cac act cca gtc aat tcc tgg cta ggc aac ata atc atg ttt 17463
Ala Arg His Thr Pro Val Asn Ser Trp Leu Gly Asn Ile Ile Met Phe
1580 1585 1590 1595

gcc ccc aca ctg tgg gcg agg atg ata ctg atg acc cat ttc ttt agc 17511
Ala Pro Thr Leu Trp Ala Arg Met Ile Leu Met Thr His Phe Phe Ser
1600 1605 1610

gtc ctt ata gcc agg gac cag ctt gaa cag gcc ctc gat tgc gag atc 17559
Val Leu Ile Ala Arg Asp Gln Leu Glu Gln Ala Leu Asp Cys Glu Ile
1615 1620 1625

tac ggg gcc tgc tac tcc ata gaa cca ctg gat cta cct cca atc att 17607
Tyr Gly Ala Cys Tyr Ser Ile Glu Pro Leu Asp Leu Pro Pro Ile Ile
1630 1635 1640

caa aga ctc cat ggc ctc agc gca ttt tca ctc cac agt tac tct cca 17655
Gln Arg Leu His Gly Leu Ser Ala Phe Ser Leu His Ser Tyr Ser Pro
1645 1650 1655

ggt gaa atc aat agg gtg gcc gca tgc ctc aga aaa ctt ggg gta ccg 17703
Gly Glu Ile Asn Arg Val Ala Ala Cys Leu Arg Lys Leu Gly Val Pro
1660 1665 1670 1675

ccc ttg cga gct tgg aga cac cgg gcc cgg agc gtc cgc gct agg ctt 17751
Pro Leu Arg Ala Trp Arg His Arg Ala Arg Ser Val Arg Ala Arg Leu
1680 1685 1690

ctg gcc aga gga ggc agg gct gcc ata tgt ggc aag tac ctc ttc aac 17799
Leu Ala Arg Gly Gly Arg Ala Ala Ile Cys Gly Lys Tyr Leu Phe Asn
1695 1700 1705

tgg gca gta aga aca aag ctc aaa ctc act cca ata gcg gcc gct ggc 17847
Trp Ala Val Arg Thr Lys Leu Lys Leu Thr Pro Ile Ala Ala Ala Gly
1710 1715 1720

cag ctg gac ttg tcc ggc tgg ttc acg gct ggc tac agc ggg gga gac 17895
Gln Leu Asp Leu Ser Gly Trp Phe Thr Ala Gly Tyr Ser Gly Gly Asp
1725 1730 1735

att tat cac agc gtg tct cat gcc cgg ccc cgc tgg atc tgg ttt tgc 17943
Ile Tyr His Ser Val Ser His Ala Arg Pro Arg Trp Ile Trp Phe Cys
1740 1745 1750 1755

cta ctc ctg ctt gct gca ggg gta ggc atc tac ctc ctc ccc aac cga 17991
Leu Leu Leu Leu Ala Ala Gly Val Gly Ile Tyr Leu Leu Pro Asn Arg
1760 1765 1770

atg agc acg aat cct aaa cct caa aga aag acc aaa cgt aac acc aac 18039
Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr Asn
1775 1780 1785

cgg cgg ccg cag gac gtc aag ttc ccg ggt ggc ggt cag atc gtt ggt 18087
Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly Gln Ile Val Gly
1790 1795 1800

gga gtt tac ttg ttg ccg cgc agg ggc cct aga ttg ggt gtg cgc gcg 18135
Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Leu Gly Val Arg Ala
1805 1810 1815

acg aga aag act tcc gag cgg tcg caa cct cga ggt aga cgt cag cct 18183
Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly Arg Arg Gln Pro
1820 1825 1830 1835

atc ccc aag gct cgt cgg ccc gag ggc agg acc tgg gct cag ccc ggg 18231
Ile Pro Lys Ala Arg Arg Pro Glu Gly Arg Thr Trp Ala Gln Pro Gly
1840 1845 1850

tac cct tgg ccc ctc tat ggc aat gag ggc tgc ggg tgg gcg gga tgg 18279
Tyr Pro Trp Pro Leu Tyr Gly Asn Glu Gly Cys Gly Trp Ala Gly Trp
1855 1860 1865

ctc ctg tct ccc cgt ggc tct cgg cct agc tgg ggc ccc aca gac ccc 18327
Leu Leu Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Thr Asp Pro
1870 1875 1880

cgg cgt agg tcg cgc aat ttg ggt aag gtc atc gat acc ctt acg tgc 18375
Arg Arg Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu Thr Cys
1885 1890 1895

ggc ttc gcc gac ctc atg ggg tac ata ccg ctc gtc taatagtcga 18421
Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val
1900 1905 1910



ctttgttccc actgtacttt tagctcgtac aaaatacaat atacttttca tttctccgta 18481
aacaacatgt tttcccatgt aatatccttt tctatttttc gttccgttac caactttaca 18541
catactttat atagctattc acttctatac actaaaaaac taagacaatt ttaattttgc 18601
tgcctgccat atttcaattt gttataaatt cctataattt atcctattag tagctaaaaa 18661
aagatgaatg tgaatcgaat cctaagagaa ttggatctga tccacaggac gggtgtggtc 18721
gccatgatcg cgtagtcgat agtggctcca agtagcgaag cgagcaggac tgggcggcgg 18781
ccaaagcggt cggacagtgc tccgagaacg ggtgcgcata gaaattgcat caacgcatat 18841
agcgctagca gcacgccata gtgactggcg atgctgtcgg aatggacgat atcccgcaag 18901
aggcccggca gtaccggcat aaccaagcct atgcctacag catccagggt gacggtgccg 18961
aggatgacga tgagcgcatt gttagatttc atacacggtg cctgactgcg ttagcaattt 19021
aactgtgata aactaccgca ttaaagcttt ttctttccaa tttttttttt ttcgtcatta 19081
taaaaatcat tacgaccgag attcccgggt aataactgat ataattaaat tgaagctcta 19141
atttgtgagt ttagtataca tgcatttact tataatacag ttttttagtt ttgctggccg 19201
catcttctca aatatgcttc ccagcctgct tttctgtaac gttcaccctc taccttagca 19261
tcccttccct ttgcaaatag tcctcttcca acaataataa tgtcagatcc tgtagagacc 19321
acatcatcca cggttctata ctgttgaccc aatgcgtctc ccttgtcatc taaacccaca 19381
ccgggtgtca taatcaacca atcgtaacct tcatctcttc cacccatgtc tctttgagca 19441
ataaagccga taacaaaatc tttgtcgctc ttcgcaatgt caacagtacc cttagtatat 19501
tctccagtag atagggagcc cttgcatgac aattctgcta acatcaaaag gcctctaggt 19561
tcctttgtta cttcttctgc cgcctgcttc aaaccgctaa caatacctgg gcccaccaca 19621
ccgtgtgcat tcgtaatgtc tgcccattct gctattctgt atacacccgc agagtactgc 19681
aatttgactg tattaccaat gtcagcaaat tttctgtctt cgaagagtaa aaaattgtac 19741
ttggcggata atgcctttag cggcttaact gtgccctcca tggaaaaatc agtcaagata 19801
tccacatgtg tttttagtaa acaaattttg ggacctaatg cttcaactaa ctccagtaat 19861
tccttggtgg tacgaacatc caatgaagca cacaagtttg tttgcttttc gtgcatgata 19921
ttaaatagct tggcagcaac aggactagga tgagtagcag cacgttcctt atatgtagct 19981
ttcgacatga tttatcttcg tttcctgcag gtttttgttc tgtgcagttg ggttaagaat 20041
actgggcaat ttcatgtttc ttcaacacta catatgcgta tatataccaa tctaagtctg 20101
tgctccttcc ttcgttcttc cttctgttcg gagattaccg aatcaaaaaa atttcaagga 20161
aaccgaaatc aaaaaaaaga ataaaaaaaa aatgatgaat tgaaaagctt atcgat 20217



<210> 17 
<211> 1911 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 
pd.delta.NS3NS5.pj.core140

<400> 17 

Met Ala Ala Tyr Ala Ala Gln Gly Tyr Lys Val Leu Val Leu Asn Pro
1 5 10 15

Ser Val Ala Ala Thr Leu Gly Phe Gly Ala Tyr Met Ser Lys Ala His
20 25 30

Gly Ile Asp Pro Asn Ile Arg Thr Gly Val Arg Thr Ile Thr Thr Gly
35 40 45

Ser Pro Ile Thr Tyr Ser Thr Tyr Gly Lys Phe Leu Ala Asp Gly Gly
50 55 60

Cys Ser Gly Gly Ala Tyr Asp Ile Ile Ile Cys Asp Glu Cys His Ser
65 70 75 80

Thr Asp Ala Thr Ser Ile Leu Gly Ile Gly Thr Val Leu Asp Gln Ala
85 90 95

Glu Thr Ala Gly Ala Arg Leu Val Val Leu Ala Thr Ala Thr Pro Pro
100 105 110

Gly Ser Val Thr Val Pro His Pro Asn Ile Glu Glu Val Ala Leu Ser
115 120 125

Thr Thr Gly Glu Ile Pro Phe Tyr Gly Lys Ala Ile Pro Leu Glu Val
130 135 140

Ile Lys Gly Gly Arg His Leu Ile Phe Cys His Ser Lys Lys Lys Cys
145 150 155 160

Asp Glu Leu Ala Ala Lys Leu Val Ala Leu Gly Ile Asn Ala Val Ala
165 170 175

Tyr Tyr Arg Gly Leu Asp Val Ser Val Ile Pro Thr Ser Gly Asp Val
180 185 190

Val Val Val Ala Thr Asp Ala Leu Met Thr Gly Tyr Thr Gly Asp Phe
195 200 205

Asp Ser Val Ile Asp Cys Asn Thr Cys Val Thr Gln Thr Val Asp Phe
210 215 220

Ser Leu Asp Pro Thr Phe Thr Ile Glu Thr Ile Thr Leu Pro Gln Asp
225 230 235 240

Ala Val Ser Arg Thr Gln Arg Arg Gly Arg Thr Gly Arg Gly Lys Pro
245 250 255

Gly Ile Tyr Arg Phe Val Ala Pro Gly Glu Arg Pro Ser Gly Met Phe
260 265 270

Asp Ser Ser Val Leu Cys Glu Cys Tyr Asp Ala Gly Cys Ala Trp Tyr
275 280 285

Glu Leu Thr Pro Ala Glu Thr Thr Val Arg Leu Arg Ala Tyr Met Asn
290 295 300

Thr Pro Gly Leu Pro Val Cys Gln Asp His Leu Glu Phe Trp Glu Gly
305 310 315 320

Val Phe Thr Gly Leu Thr His Ile Asp Ala His Phe Leu Ser Gln Thr
325 330 335

Lys Gln Ser Gly Glu Asn Leu Pro Tyr Leu Val Ala Tyr Gln Ala Thr
340 345 350

Val Cys Ala Arg Ala Gln Ala Pro Pro Pro Ser Trp Asp Gln Met Trp
355 360 365

Lys Cys Leu Ile Arg Leu Lys Pro Thr Leu His Gly Pro Thr Pro Leu
370 375 380

Leu Tyr Arg Leu Gly Ala Val Gln Asn Glu Ile Thr Leu Thr His Pro
385 390 395 400

Val Thr Lys Tyr Ile Met Thr Cys Met Ser Ala Asp Leu Glu Val Val
405 410 415

Thr Ser Thr Trp Val Leu Val Gly Gly Val Leu Ala Ala Leu Ala Ala
420 425 430

Tyr Cys Leu Ser Thr Gly Cys Val Val Ile Val Gly Arg Val Val Leu
435 440 445

Ser Gly Lys Pro Ala Ile Ile Pro Asp Arg Glu Val Leu Tyr Arg Glu
450 455 460

Phe Asp Glu Met Glu Glu Cys Ser Gln His Leu Pro Tyr Ile Glu Gln
465 470 475 480

Gly Met Met Leu Ala Glu Gln Phe Lys Gln Lys Ala Leu Gly Leu Leu
485 490 495

Gln Thr Ala Ser Arg Gln Ala Glu Val Ile Ala Pro Ala Val Gln Thr
500 505 510

Asn Trp Gln Lys Leu Glu Thr Phe Trp Ala Lys His Met Trp Asn Phe
515 520 525

Ile Ser Gly Ile Gln Tyr Leu Ala Gly Leu Ser Thr Leu Pro Gly Asn
530 535 540

Pro Ala Ile Ala Ser Leu Met Ala Phe Thr Ala Ala Val Thr Ser Pro
545 550 555 560

Leu Thr Thr Ser Gln Thr Leu Leu Phe Asn Ile Leu Gly Gly Trp Val
565 570 575

Ala Ala Gln Leu Ala Ala Pro Gly Ala Ala Thr Ala Phe Val Gly Ala
580 585 590

Gly Leu Ala Gly Ala Ala Ile Gly Ser Val Gly Leu Gly Lys Val Leu
595 600 605

Ile Asp Ile Leu Ala Gly Tyr Gly Ala Gly Val Ala Gly Ala Leu Val
610 615 620

Ala Phe Lys Ile Met Ser Gly Glu Val Pro Ser Thr Glu Asp Leu Val
625 630 635 640

Asn Leu Leu Pro Ala Ile Leu Ser Pro Gly Ala Leu Val Val Gly Val
645 650 655

Val Cys Ala Ala Ile Leu Arg Arg His Val Gly Pro Gly Glu Gly Ala
660 665 670

Val Gln Trp Met Asn Arg Leu Ile Ala Phe Ala Ser Arg Gly Asn His
675 680 685

Val Ser Pro Thr His Tyr Val Pro Glu Ser Asp Ala Ala Ala Arg Val
690 695 700

Thr Ala Ile Leu Ser Ser Leu Thr Val Thr Gln Leu Leu Arg Arg Leu
705 710 715 720

His Gln Trp Ile Ser Ser Glu Cys Thr Thr Pro Cys Ser Gly Ser Trp
725 730 735

Leu Arg Asp Ile Trp Asp Trp Ile Cys Glu Val Leu Ser Asp Phe Lys
740 745 750

Thr Trp Leu Lys Ala Lys Leu Met Pro Gln Leu Pro Gly Ile Pro Phe
755 760 765

Val Ser Cys Gln Arg Gly Tyr Lys Gly Val Trp Arg Gly Asp Gly Ile
770 775 780

Met His Thr Arg Cys His Cys Gly Ala Glu Ile Thr Gly His Val Lys
785 790 795 800

Asn Gly Thr Met Arg Ile Val Gly Pro Arg Thr Cys Arg Asn Met Trp
805 810 815

Ser Gly Thr Phe Pro Ile Asn Ala Tyr Thr Thr Gly Pro Cys Thr Pro
820 825 830

Leu Pro Ala Pro Asn Tyr Thr Phe Ala Leu Trp Arg Val Ser Ala Glu
835 840 845

Glu Tyr Val Glu Ile Arg Gln Val Gly Asp Phe His Tyr Val Thr Gly
850 855 860

Met Thr Thr Asp Asn Leu Lys Cys Pro Cys Gln Val Pro Ser Pro Glu
865 870 875 880

Phe Phe Thr Glu Leu Asp Gly Val Arg Leu His Arg Phe Ala Pro Pro
885 890 895

Cys Lys Pro Leu Leu Arg Glu Glu Val Ser Phe Arg Val Gly Leu His
900 905 910

Glu Tyr Pro Val Gly Ser Gln Leu Pro Cys Glu Pro Glu Pro Asp Val
915 920 925

Ala Val Leu Thr Ser Met Leu Thr Asp Pro Ser His Ile Thr Ala Glu
930 935 940

Ala Ala Gly Arg Arg Leu Ala Arg Gly Ser Pro Pro Ser Val Ala Ser
945 950 955 960

Ser Ser Ala Ser Gln Leu Ser Ala Pro Ser Leu Lys Ala Thr Cys Thr
965 970 975

Ala Asn His Asp Ser Pro Asp Ala Glu Leu Ile Glu Ala Asn Leu Leu
980 985 990

Trp Arg Gln Glu Met Gly Gly Asn Ile Thr Arg Val Glu Ser Glu Asn
995 1000 1005

Lys Val Val Ile Leu Asp Ser Phe Asp Pro Leu Val Ala Glu Glu Asp
1010 1015 1020

Glu Arg Glu Ile Ser Val Pro Ala Glu Ile Leu Arg Lys Ser Arg Arg
1025 1030 1035 1040

Phe Ala Gln Ala Leu Pro Val Trp Ala Arg Pro Asp Tyr Asn Pro Pro
1045 1050 1055

Leu Val Glu Thr Trp Lys Lys Pro Asp Tyr Glu Pro Pro Val Val His
1060 1065 1070

Gly Cys Pro Leu Pro Pro Pro Lys Ser Pro Pro Val Pro Pro Pro Arg
1075 1080 1085

Lys Lys Arg Thr Val Val Leu Thr Glu Ser Thr Leu Ser Thr Ala Leu
1090 1095 1100

Ala Glu Leu Ala Thr Arg Ser Phe Gly Ser Ser Ser Thr Ser Gly Ile
1105 1110 1115 1120

Thr Gly Asp Asn Thr Thr Thr Ser Ser Glu Pro Ala Pro Ser Gly Cys
1125 1130 1135

Pro Pro Asp Ser Asp Ala Glu Ser Tyr Ser Ser Met Pro Pro Leu Glu
1140 1145 1150

Gly Glu Pro Gly Asp Pro Asp Leu Ser Asp Gly Ser Trp Ser Thr Val
1155 1160 1165

Ser Ser Glu Ala Asn Ala Glu Asp Val Val Cys Cys Ser Met Ser Tyr
1170 1175 1180

Ser Trp Thr Gly Ala Leu Val Thr Pro Cys Ala Ala Glu Glu Gln Lys
1185 1190 1195 1200

Leu Pro Ile Asn Ala Leu Ser Asn Ser Leu Leu Arg His His Asn Leu
1205 1210 1215

Val Tyr Ser Thr Thr Ser Arg Ser Ala Cys Gln Arg Gln Lys Lys Val
1220 1225 1230

Thr Phe Asp Arg Leu Gln Val Leu Asp Ser His Tyr Gln Asp Val Leu
1235 1240 1245

Lys Glu Val Lys Ala Ala Ala Ser Lys Val Lys Ala Asn Leu Leu Ser
1250 1255 1260

Val Glu Glu Ala Cys Ser Leu Thr Pro Pro His Ser Ala Lys Ser Lys
1265 1270 1275 1280

Phe Gly Tyr Gly Ala Lys Asp Val Arg Cys His Ala Arg Lys Ala Val
1285 1290 1295

Thr His Ile Asn Ser Val Trp Lys Asp Leu Leu Glu Asp Asn Val Thr
1300 1305 1310

Pro Ile Asp Thr Thr Ile Met Ala Lys Asn Glu Val Phe Cys Val Gln
1315 1320 1325

Pro Glu Lys Gly Gly Arg Lys Pro Ala Arg Leu Ile Val Phe Pro Asp
1330 1335 1340

Leu Gly Val Arg Val Cys Glu Lys Met Ala Leu Tyr Asp Val Val Thr
1345 1350 1355 1360

Lys Leu Pro Leu Ala Val Met Gly Ser Ser Tyr Gly Phe Gln Tyr Ser
1365 1370 1375

Pro Gly Gln Arg Val Glu Phe Leu Val Gln Ala Trp Lys Ser Lys Lys
1380 1385 1390

Thr Pro Met Gly Phe Ser Tyr Asp Thr Arg Cys Phe Asp Ser Thr Val
1395 1400 1405

Thr Glu Ser Asp Ile Arg Thr Glu Glu Ala Ile Tyr Gln Cys Cys Asp
1410 1415 1420

Leu Asp Pro Gln Ala Arg Val Ala Ile Lys Ser Leu Thr Glu Arg Leu
1425 1430 1435 1440

Tyr Val Gly Gly Pro Leu Thr Asn Ser Arg Gly Glu Asn Cys Gly Tyr
1445 1450 1455

Arg Arg Cys Arg Ala Ser Gly Val Leu Thr Thr Ser Cys Gly Asn Thr
1460 1465 1470

Leu Thr Cys Tyr Ile Lys Ala Arg Ala Ala Cys Arg Ala Ala Gly Leu
1475 1480 1485

Gln Asp Cys Thr Met Leu Val Cys Gly Asp Asp Leu Val Val Ile Cys
1490 1495 1500

Glu Ser Ala Gly Val Gln Glu Asp Ala Ala Ser Leu Arg Ala Phe Thr
1505 1510 1515 1520



Glu Ala Met Thr Arg Tyr Ser Ala Pro Pro Gly Asp Pro Pro Gln Pro
1525 1530 1535

Glu Tyr Asp Leu Glu Leu Ile Thr Ser Cys Ser Ser Asn Val Ser Val
1540 1545 1550

Ala His Asp Gly Ala Gly Lys Arg Val Tyr Tyr Leu Thr Arg Asp Pro
1555 1560 1565

Thr Thr Pro Leu Ala Arg Ala Ala Trp Glu Thr Ala Arg His Thr Pro
1570 1575 1580

Val Asn Ser Trp Leu Gly Asn Ile Ile Met Phe Ala Pro Thr Leu Trp
1585 1590 1595 1600

Ala Arg Met Ile Leu Met Thr His Phe Phe Ser Val Leu Ile Ala Arg
1605 1610 1615

Asp Gln Leu Glu Gln Ala Leu Asp Cys Glu Ile Tyr Gly Ala Cys Tyr
1620 1625 1630

Ser Ile Glu Pro Leu Asp Leu Pro Pro Ile Ile Gln Arg Leu His Gly
1635 1640 1645

Leu Ser Ala Phe Ser Leu His Ser Tyr Ser Pro Gly Glu Ile Asn Arg
1650 1655 1660

Val Ala Ala Cys Leu Arg Lys Leu Gly Val Pro Pro Leu Arg Ala Trp
1665 1670 1675 1680

Arg His Arg Ala Arg Ser Val Arg Ala Arg Leu Leu Ala Arg Gly Gly
1685 1690 1695

Arg Ala Ala Ile Cys Gly Lys Tyr Leu Phe Asn Trp Ala Val Arg Thr
1700 1705 1710

Lys Leu Lys Leu Thr Pro Ile Ala Ala Ala Gly Gln Leu Asp Leu Ser
1715 1720 1725

Gly Trp Phe Thr Ala Gly Tyr Ser Gly Gly Asp Ile Tyr His Ser Val
1730 1735 1740

Ser His Ala Arg Pro Arg Trp Ile Trp Phe Cys Leu Leu Leu Leu Ala
1745 1750 1755 1760

Ala Gly Val Gly Ile Tyr Leu Leu Pro Asn Arg Met Ser Thr Asn Pro
1765 1770 1775

Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr Asn Arg Arg Pro Gln Asp
1780 1785 1790

Val Lys Phe Pro Gly Gly Gly Gln Ile Val Gly Gly Val Tyr Leu Leu
1795 1800 1805

Pro Arg Arg Gly Pro Arg Leu Gly Val Arg Ala Thr Arg Lys Thr Ser
1810 1815 1820

Glu Arg Ser Gln Pro Arg Gly Arg Arg Gln Pro Ile Pro Lys Ala Arg
1825 1830 1835 1840

Arg Pro Glu Gly Arg Thr Trp Ala Gln Pro Gly Tyr Pro Trp Pro Leu
1845 1850 1855

Tyr Gly Asn Glu Gly Cys Gly Trp Ala Gly Trp Leu Leu Ser Pro Arg
1860 1865 1870

Gly Ser Arg Pro Ser Trp Gly Pro Thr Asp Pro Arg Arg Arg Ser Arg
1875 1880 1885

Asn Leu Gly Lys Val Ile Asp Thr Leu Thr Cys Gly Phe Ala Asp Leu
1890 1895 1900

Met Gly Tyr Ile Pro Leu Val
1905 1910



<210> 18
<211> 20247
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence:
pd.delta.NS3NS5.pj.core150

<220>
<221> CDS
<222> (12679)..(18441)

<400> 18



atcgatccta ccccttgcgc taaagaagta tatgtgccta ctaacgcttg tctttgtctc 60
tgtcactaaa cactggatta ttactcccag atacttattt tggactaatt taaatgattt 120
cggatcaacg ttcttaatat cgctgaatct tccacaattg atgaaagtag ctaggaagag 180
gaattggtat aaagtttttg tttttgtaaa tctcgaagta tactcaaacg aatttagtat 240
tttctcagtg atctcccaga tgctttcacc ctcacttaga agtgctttaa gcattttttt 300
actgtggcta tttcccttat ctgcttcttc cgatgattcg aactgtaatt gcaaactact 360
tacaatatca gtgatatcag attgatgttt ttgtccatag taaggaataa ttgtaaattc 420
ccaagcagga atcaatttct ttaatgaggc ttccagaatt gttgcttttt gcgtcttgta 480
tttaaactgg agtgatttat tgacaatatc gaaactcagc gaattgctta tgatagtatt 540
atagctcatg aatgtggctc tcttgattgc tgttccgtta tgtgtaatca tccaacataa 600
ataggttagt tcagcagcac ataatgctat tttctcacct gaaggtcttt caaacctttc 660
cacaaactga cgaacaagca ccttaggtgg tgttttacat aatatatcaa attgtggcat 720






[illegible] actactctat ttatcttgta 780


tcttgcagta ttcaaacacg ctaactcgaa aaactaactt taattgtcct gtttgtctcg 840
cgttctttcg aaaaatgcac cggccgcgca ttatttgtac tgcgaaaata attggtactg 900
cggtatcttc atttcatatt ttaaaaatgc acctttgctg cttttcctta atttttagac 960
ggcccgcagg ttcgttttgc ggtactatct tgtgataaaa agttgttttg acatgtgatc 1020
tgcacagatt ttataatgta ataagcaaga atacattatc aaacgaacaa tactggtaaa 1080
agaaaaccaa aatggacgac attgaaacag ccaagaatct gacggtaaaa gcacgtacag 1140
cttatagcgt ctgggatgta tgtcggctgt ttattgaaat gattgctcct gatgtagata 1200
ttgatataga gagtaaacgt aagtctgatg agctactctt tccaggatat gtcataaggc 1260
ccatggaatc tctcacaacc ggtaggccgt atggtcttga ttctagcgca gaagattcca 1320


gcgtatcttc tgactccagt gctgaggtaa ttttgcctgc tgcgaagatg gttaaggaaa 1380
ggtttgattc gattggaaat ggtatgctct cttcacaaga agcaagtcag gctgccatag 1440
atttgatgct acagaataac aagctgttag acaatagaaa gcaactatac aaatctattg 1500
ctataataat aggaagattg cccgagaaag acaagaagag agctaccgaa atgctcatga 1560
gaaaaatgga ttgtacacag ttattagtcc caccagctcc aacggaagaa gatgttatga 1620
agctcgtaag cgtcgttacc caattgctta ctttagttcc accagatcgt caagctgctt 1680
taataggtga tttattcatc ccggaatctc taaaggatat attcaatagt ttcaatgaac 1740
tggcggcaga gaatcgttta cagcaaaaaa agagtgagtt ggaaggaagg actgaagtga 1800
accatgctaa tacaaatgaa gaagttccct ccaggcgaac aagaagtaga gacacaaatg 1860
caagaggagc atataaatta caaaacacca tcactgaggg ccctaaagcg gttcccacga 1920
aaaaaaggag agtagcaacg agggtaaggg gcagaaaatc acgtaatact tctagggtat 1980
gatccaatat caaaggaaat gatagcattg aaggatgaga ctaatccaat tgaggagtgg 2040
cagcatatag aacagctaaa gggtagtgct gaaggaagca tacgataccc cgcatggaat 2100
gggataatat cacaggaggt actagactac ctttcatcct acataaatag acgcatataa 2160
gtacgcattt aagcataaac acgcactatg ccgttcttct catgtatata tatatacagg 2220
caacacgcag atataggtgc gacgtgaaca gtgagctgta tgtgcgcagc tcgcgttgca 2280
ttttcggaag cgctcgtttt cggaaacgct ttgaagttcc tattccgaag ttcctattct 2340
ctagaaagta taggaacttc agagcgcttt tgaaaaccaa aagcgctctg aagacgcact 2400
ttcaaaaaac caaaaacgca ccggactgta acgagctact aaaatattgc gaataccgct 2460
tccacaaaca ttgctcaaaa gtatctcttt gctatatatc tctgtgctat atccctatat 2520
aacctaccca tccacctttc gctccttgaa cttgcatcta aactcgacct ctacatcaac 2580
aggcttccaa tgctcttcaa attttactgt caagtagacc catacggctg taatatgctg 2640
ctcttcataa tgtaagctta tctttatcga atcgtgtgaa aaactactac cgcgataaac 2700
ctttacggtt ccctgagatt gaattagttc ctttagtata tgatacaaga cacttttgaa 2760
ctttgtacga cgaattttga ggttcgccat cctctggcta tttccaatta tcctgtcggc 2820
tattatctcc gcctcagttt gatcttccgc ttcagactgc catttttcac ataatgaatc 2880
tatttcaccc cacaatcctt catccgcctc cgcatcttgt tccgttaaac tattgacttc 2940
atgttgtaca ttgtttagtt cacgagaagg gtcctcttca ggcggtagct cctgatctcc 3000
tatatgacct ttatcctgtt ctctttccac aaacttagaa atgtattcat gaattatgga 3060
gcacctaata acattcttca aggcggagaa gtttgggcca gatgcccaat atgcttgaca 3120


tgaaaacgtg agaatgaatt tagtattatt gtgatattct gaggcaattt tattataatc 3180
tcgaagataa gagaagaatg cagtgacctt tgtattgaca aatggagatt ccatgtatct 3240
aaaaaatacg cctttaggcc ttctgatacc ctttcccctg cggtttagcg tgccttttac 3300
attaatatct aaaccctctc cgatggtggc ctttaactga ctaataaatg caaccgatat 3360
aaactgtgat aattctgggt gatttatgat tcgatcgaca attgtattgt acactagtgc 3420
aggatcaggc caatccagtt ctttttcaat taccggtgtg tcgtctgtat tcagtacatg 3480
tccaacaaat gcaaatgcta acgttttgta tttcttataa ttgtcaggaa ctggaaaagt 3540
cccccttgtc gtctcgatta cacacctact ttcatcgtac accataggtt ggaagtgctg 3600
cataatacat tgcttaatac aagcaagcag tctctcgcca ttcatatttc agttattttc 3660
cattacagct gatgtcattg tatatcagcg ctgtaaaaat ctatctgtta cagaaggttt 3720
tcgcggtttt tataaacaaa actttcgtta cgaaatcgag caatcacccc agctgcgtat 3780
ttggaaattc gggaaaaagt agagcaacgc gagttgcatt ttttacacca taatgcatga 3840
ttaacttcga gaagggatta aggctaattt cactagtatg tttcaaaaac ctcaatctgt 3900
ccattgaatg ccttataaaa cagctataga ttgcatagaa gagttagcta ctcaatgctt 3960
tttgtcaaag cttactgatg atgatgtgtc tactttcagg cgggtctgta gtaaggagaa 4020
tgacattata aagctggcac ttagaattcc acggactata gactatacta gtatactccg 4080
tctactgtac gatacacttc cgctcaggtc cttgtccttt aacgaggcct taccactctt 4140
ttgttactct attgatccag ctcagcaaag gcagtgtgat ctaagattct atcttcgcga 4200
tgtagtaaaa ctagctagac cgagaaagag actagaaatg caaaaggcac ttctacaatg 4260
gctgccatca ttattatccg atgtgacgct gcattttttt tttttttttt tttttttttt 4320
tttttttttt tttttttttt ttttttggta caaatatcat aaaaaaagag aatcttttta 4380
agcaaggatt ttcttaactt cttcggcgac agcatcaccg acttcggtgg tactgttgga 4440
accacctaaa tcaccagttc tgatacctgc atccaaaacc tttttaactg catcttcaat 4500
ggctttacct tcttcaggca agttcaatga caatttcaac atcattgcag cagacaagat 4560
agtggcgata gggttgacct tattctttgg caaatctgga gcggaaccat ggcatggttc 4620
gtacaaacca aatgcggtgt tcttgtctgg caaagaggcc aaggacgcag atggcaacaa 4680
acccaaggag cctgggataa cggaggcttc atcggagatg atatcaccaa acatgttgct 4740
ggtgattata ataccattta ggtgggttgg gttcttaact aggatcatgg cggcagaatc 4800
aatcaattga tgttgaactt tcaatgtagg gaattcgttc ttgatggttt cctccacagt 4860
ttttctccat aatcttgaag aggccaaaac attagcttta tccaaggacc aaataggcaa 4920


tggtggctca tgttgtaggg ccatgaaagc ggccattctt gtgattcttt gcacttctgg 4980
aacggtgtat tgttcactat cccaagcgac accatcacca tcgtcttcct ttctcttacc 5040
aaagtaaata cctcccacta attctctaac aacaacgaag tcagtacctt tagcaaattg 5100
tggcttgatt ggagataagt ctaaaagaga gtcggatgca aagttacatg gtcttaagtt 5160
ggcgtacaat tgaagttctt tacggatttt tagtaaacct tgttcaggtc taacactacc 5220
ggtaccccat ttaggaccac ccacagcacc taacaaaacg gcatcagcct tcttggaggc 5280
ttccagcgcc tcatctggaa gtggaacacc tgtagcatcg atagcagcac caccaattaa 5340
atgattttcg aaatcgaact tgacattgga acgaacatca gaaatagctt taagaacctt 5400
aatggcttcg gctgtgattt cttgaccaac gtggtcacct ggcaaaacga cgatcttctt 5460
aggggcagac attacaatgg tatatccttg aaatatatat aaaaaaaaaa aaaaaaaaaa 5520
aaaaaaaaaa atgcagcttc tcaatgatat tcgaatacgc tttgaggaga tacagcctaa 5580
tatccgacaa actgttttac agatttacga tcgtacttgt tacccatcat tgaattttga 5640
acatccgaac ctgggagttt tccctgaaac agatagtata tttgaacctg tataataata 5700
tatagtctag cgctttacgg aagacaatgt atgtatttcg gttcctggag aaactattgc 5760
atctattgca taggtaatct tgcacgtcgc atccccggtt cattttctgc gtttccatct 5820
tgcacttcaa tagcatatct ttgttaacga agcatctgtg cttcattttg tagaacaaaa 5880
atgcaacgcg agagcgctaa tttttcaaac aaagaatctg agctgcattt ttacagaaca 5940
gaaatgcaac gcgaaagcgc tattttacca acgaagaatc tgtgcttcat ttttgtaaaa 6000
caaaaatgca acgcgagagc gctaattttt caaacaaaga atctgagctg catttttaca 6060
gaacagaaat gcaacgcgag agcgctattt taccaacaaa gaatctatac ttcttttttg 6120
ttctacaaaa atgcatcccg agagcgctat ttttctaaca aagcatctta gattactttt 6180
tttctccttt gtgcgctcta taatgcagtc tcttgataac tttttgcact gtaggtccgt 6240
taaggttaga agaaggctac tttggtgtct attttctctt ccataaaaaa agcctgactc 6300
cacttcccgc gtttactgat tactagcgaa gctgcgggtg cattttttca agataaaggc 6360
atccccgatt atattctata ccgatgtgga ttgcgcatac tttgtgaaca gaaagtgata 6420
gcgttgatga ttcttcattg gtcagaaaat tatgaacggt ttcttctatt ttgtctctat 6480
atactacgta taggaaatgt ttacattttc gtattgtttt cgattcactc tatgaatagt 6540
tcttactaca atttttttgt ctaaagagta atactagaga taaacataaa aaatgtagag 6600
gtcgagttta gatgcaagtt caaggagcga aaggtggatg ggtaggttat atagggatat 6660
agcacagaga tatatagcaa agagatactt ttgagcaatg tttgtggaag cggtattcgc 6720


aatattttag tagctcgtta cagtccggtg cgtttttggt tttttgaaag tgcgtcttca 6780
gagcgctttt ggttttcaaa agcgctctga agttcctata ctttctagag aataggaact 6840
tcggaatagg aacttcaaag cgtttccgaa aacgagcgct tccgaaaatg caacgcgagc 6900
tgcgcacata cagctcactg ttcacgtcgc acctatatct gcgtgttgcc tgtatatata 6960
tatacatgag aagaacggca tagtgcgtgt ttatgcttaa atgcgtactt atatgcgtct 7020
atttatgtag gatgaaaggt agtctagtac ctcctgtgat attatcccat tccatgcggg 7080
gtatcgtatg cttccttcag cactaccctt tagctgttct atatgctgcc actcctcaat 7140
tggattagtc tcatccttca atgctatcat ttcctttgat attggatcat atgcatagta 7200
ccgagaaact agtgcgaagt agtgatcagg tattgctgtt atctgatgag tatacgttgt 7260
cctggccacg gcagaagcac gcttatcgct ccaatttccc acaacattag tcaactccgt 7320
taggcccttc attgaaagaa atgaggtcat caaatgtctt ccaatgtgag attttgggcc 7380
attttttata gcaaagattg aataaggcgc atttttcttc aaagctttat tgtacgatct 7440
gactaagtta tcttttaata attggtattc ctgtttattg cttgaagaat tgccggtcct 7500
atttactcgt tttaggactg gttcagaatt cctcaaaaat tcatccaaat atacaagtgg 7560
atcgatgata agctgtcaaa catgagaatt cttgaagacg aaagggcctc gtgatacgcc 7620
tatttttata ggttaatgtc atgataataa tggtttctta gacgtcaggt ggcacttttc 7680
ggggaaatgt gcgcggaacc cctatttgtt tatttttcta aatacattca aatatgtatc 7740
cgctcatgag acaataaccc tgataaatgc ttcaataata ttgaaaaagg aagagtatga 7800
gtattcaaca tttccgtgtc gcccttattc ccttttttgc ggcattttgc cttcctgttt 7860
ttgctcaccc agaaacgctg gtgaaagtaa aagatgctga agatcagttg ggtgcacgag 7920
tgggttacat cgaactggat ctcaacagcg gtaagatcct tgagagtttt cgccccgaag 7980
aacgttttcc aatgatgagc acttttaaag ttctgctatg tggcgcggta ttatcccgtg 8040
ttgacgccgg gcaagagcaa ctcggtcgcc gcatacacta ttctcagaat gacttggttg 8100
agtactcacc agtcacagaa aagcatctta cggatggcat gacagtaaga gaattatgca 8160
gtgctgccat aaccatgagt gataacactg cggccaactt acttctgaca acgatcggag 8220
gaccgaagga gctaaccgct tttttgcaca acatggggga tcatgtaact cgccttgatc 8280
gttgggaacc ggagctgaat gaagccatac caaacgacga gcgtgacacc acgatgcctg 8340
cagcaatggc aacaacgttg cgcaaactat taactggcga actacttact ctagcttccc 8400
ggcaacaatt aatagactgg atggaggcgg ataaagttgc aggaccactt ctgcgctcgg 8460


cccttccggc 


tggctggttt 


attgctgata 


aatctggagc 


cggtgagcgt 


gggtctcgcg 


8520 


gtatcattgc 


agcactgggg 


ccagatggta 


agccctcccg 


tatcgtagtt 


atctacacga 


8580 
cggggagtca 


ggcaactatg 


gatgaacgaa 


atagacagat 


cgctgagata 


ggtgcctcac 


8640 


tgattaagca 


ttggtaactg 


tcagaccaag 


tttactcata 


tatactttag 


attgatttaa 


8700 


aacttcattt 


ttaatttaaa 


aggatctagg 


tgaagatcct 


ttttgataat 


ctcatgacca 


8760 


aaatccctta 


acgtgagttt 


tcgttccact 


gagcgtcaga 


ccccgtagaa 


aagatcaaag 


8820 


gatcttcttg 


agatcctttt 


tttctgcgcg 


taatctgctg 


cttgcaaaca 


aaaaaaccac 


8880 


cgctaccagc 


ggtggtttgt 


ttgccggatc 


aagagctacc 


aactcttttt 


ccgaaggtaa 


8940 


ctggcttcag 


cagagcgcag 


ataccaaata 


ctgtccttct 


agtgtagccg 


tagttaggcc 


9000 


accacttcaa 


gaactctgta 


gcaccgccta 


catacctcgc 


tctgctaatc 


ctgttaccag 


9060 


tggctgctgc 


cagtggcgat 


aagtcgtgtc 


ttaccgggtt 


ggactcaaga 


cgatagttac 


9120 


cggataaggc 


gcagcggtcg 


ggctgaacgg 


ggggttcgtg 


cacacagccc 


agcttggagc 


9180 


gaacgaccta 


caccgaactg 


agatacctac 


agcgtgagct 


atgagaaagc 


gccacgcttc 


9240 


ccgaagggag 


aaaggcggac 


aggtatccgg 


taagcggcag 


ggtcggaaca 


ggagagcgca 


9300 


cgagggagct 


tccaggggga 


aacgcctggt 


atctttatag 


tcctgtcggg 


tttcgccacc 


9360 


tctgacttga 


gcgtcgattt 


ttgtgatgct 


cgtcaggggg 


gcggagccta 


tggaaaaacg 


9420 


ccagcaacgc 


ggccttttta 


cggttcctgg 


ccttttgctg 


gccttttgct 


cacatgttct 


9480 


ttcctgcgtt 


atcccctgat 


tctgtggata 


accgtattac 


cgcctttgag 


tgagctgata 


9540 


ccgctcgccg 


cagccgaacg 


accgagcgca 


gcgagtcagt 


gagcgaggaa 


gcggaagagc 


9600 


gcctgatgcg 


gtattttctc 


cttacgcatc 


tgtgcggtat 


ttcacaccgc 


atatggtgca 


9660 


ctctcagtac 


aatctgctct 


gatgccgcat 


agttaagcca 


gtatacactc 


cgctatcgct 


9720 


acgtgactgg 


gtcatggctg 


cgccccgaca 


cccgccaaca 


cccgctgacg 


cgccctgacg 


9780 


ggcttgtctg 


ctcccggcat 


ccgcttacag 


acaagctgtg 


accgtctccg 


ggagctgcat 


9840 


gtgtcagagg 


ttttcaccgt 


catcaccgaa 


acgcgcgagg 


cagctgcggt 


aaagctcatc 


9900 


agcgtggtcg 


tgaagcgatt 


cacagatgtc 


tgcctgttca 


tccgcgtcca 


gctcgttgag 


9960 


tttctccaga 


agcgttaatg 


tctggcttct 


gataaagcgg 


gccatgttaa 


gggcggtttt 


10020 


ttcctgtttg 


gtcactgatg 


cctccgtgta 


agggggattt 


ctgttcatgg 


gggtaatgat 


10080 


accgatgaaa 


cgagagagga 


tgctcacgat 


acgggttact 


gatgatgaac 


atgcccggtt 


10140 


actggaacgt 


tgtgagggta 


aacaactggc 


ggtatggatg 


cggcgggacc 


agagaaaaat 


10200 


cactcagggt 


caatgccagc 


gcttcgttaa 


tacagatgta 


ggtgttccac 


agggtagcca 


10260 


gcagcatcct 


gcgatgcaga 


tccggaacat 


aatggtgcag 


ggcgctgact 


tccgcgtttc 


10320 


cagactttac 


gaaacacgga 


aaccgaagac 


cattcatgtt 


gttgctcagg 


tcgcagacgt 


10380 
tttgcagcag 


cagtcgcttc 


acgttcgctc 


gcgtatcggt 


gattcattct 


gctaaccagt 


10440 


aaggcaaccc 


cgccagccta 


gccgggtcct 


caacgacagg 


agcacgatca 


tgcgcacccg 


10500 


tggccaggac 


ccaacgctgc 


ccgagatgcg 


ccgcgtgcgg 


ctgctggaga 


tggcggacgc 


10560 


gatggatatg 


ttctgccaag 


ggttggtttg 


cgcattcaca 


gttctccgca 


agaattgatt 


10620 


ggctccaatt 


cttggagtgg 


tgaatccgtt 


agcgaggtgc 


cgccggcttc 


cattcaggtc 


10680 


gaggtggccc 


ggctccatgc 


accgcgacgc 


aacgcgggga 


ggcagacaag 


gtatagggcg 


10740 


gcgcctacaa 


tccatgccaa 


cccgttccat 


gtgctcgccg 


aggcggcata 


aatcgccgtg 


10800 


acgatcagcg 


gtccaatgat 


cgaagttagg 


ctggtaagag 


ccgcgagcga 


tccttgaagc 


10860 


tgtccctgat 


ggtcgtcatc 


tacctgcctg 


gacagcatgg 


cctgcaacgc 


gggcatcccg 


10920 


atgccgccgg 


aagcgagaag 


aatcataatg 


gggaaggcca 


tccagcctcg 


cgtcgcgaac 


10980 


gccagcaaga 


cgtagcccag 


cgcgtcggcc 


gccatgccgg 


cgataatggc 


ctgcttctcg 


11040 


ccgaaacgtt 


tggtggcggg 


accagtgacg 


aaggcttgag 


cgagggcgtg 


caagattccg 


11100 


aataccgcaa 


gcgacaggcc 


gatcatcgtc 


gcgctccagc 


gaaagcggtc 


ctcgccgaaa 


11160 


atgacccaga 


gcgctgccgg 


cacctgtcct 


acgagttgca 


tgataaagaa 


gacagtcata 


11220 


agtgcggcga 


cgatagtcat 


gccccgcgcc 


caccggaagg 


agctgactgg 


gttgaaggct 


11280 


ctcaagggca 


tcggtcgagg 


atccttcaat 


atgcgcacat 


acgctgttat 


gttcaaggtc 


11340 


ccttcgttta 


agaacgaaag 


cggtcttcct 


tttgagggat 


gtttcaagtt 


gttcaaatct 


11400 


atcaaatttg 


caaatcccca 


gtctgtatct 


agagcgttga 


atcggtgatg 


cgatttgtta 


11460 


attaaattga 


tggtgtcacc 


attaccaggt 


ctagatatac 


caatggcaaa 


ctgagcacaa 


11520 


caataccagt 


ccggatcaac 


tggcaccatc 


tctcccgtag 


tctcatctaa 


tttttcttcc 


11580 


ggatgaggtt 


ccagatatac 


cgcaacacct 


ttattatggt 


ttccctgagg 


gaataataga 


11640 


atgtcccatt 


cgaaatcacc 


aattctaaac 


ctgggcgaat 


tgtatttcgg 


gtttgttaac 


11700 


tcgttccagt 


caggaatgtt 


ccacgtgaag 


ctatcttcca 


gcaaagtctc 


cacttcttca 


11760 


tcaaattgtg 


gagaatactc 


ccaatgctct 


tatctatggg 


acttccggga 


aacacagtac 


11820 


cgatacttcc 


caattcgtct 


tcagagctca 


ttgtttgttt 


gaagagacta 


atcaaagaat 


11880 


cgttttctca 


aaaaaattaa 


tatcttaact 


gatagtttga 


tcaaaggggc 


aaaacgtagg 


11940 


ggcaaacaaa 


cggaaaaatc 


gtttctcaaa 


ttttctgatg 


ccaagaactc 


taaccagtct 


12000 


tatctaaaaa 


ttgccttatg 


atccgtctct 


ccggttacag 


cctgtgtaac 


tgattaatcc 


12060 


tgcctttcta 


atcaccattc 


taatgtttta 


attaagggat 


tttgtcttca 


ttaacggctt 


12120 


tcgctcataa 


aaatgttatg 


acgttttgcc 


cgcaggcggg 


aaaccatcca 


cttcacgaga 


12180 
ctgatctcct ctgccggaac accgggcatc tccaacttat aagttggaga aataagagaa 12240 

tttcagattg agagaatgaa aaaaaaaaac ccttagttca taggtccatt ctcttagcgc 12300 

aactacagag aacaggggca caaacaggca aaaaacgggc acaacctcaa tggagtgatg 12360 

caacctgcct ggagtaaatg atgacacaag gcaattgacc cacgcatgta tctatctcat 12420 

tttcttacac cttctattac cttctgctct ctctgatttg gaaaaagctg aaaaaaaagg 12480 

ttgaaaccag ttccctgaaa ttattcccct acttgactaa taagtatata aagacggtag 12540 

gtattgattg taattctgta aatctatttc ttaaacttct taaattctac ttttatagtt 12600 

agtctttttt ttagttttaa aacaccaaga acttagtttc gaataaacac acataaacaa 12660 

acaagcttac aaaacaaa atg gct gca tat gca gct cag ggc tat aag gtg 12711 
Met Ala Ala Tyr Ala Ala Gln Gly Tyr Lys Val 
1 5 10 

cta gta ctc aac ccc tct gtt gct gca aca ctg ggc ttt ggt gct tac 12759 
Leu Val Leu Asn Pro Ser Val Ala Ala Thr Leu Gly Phe Gly Ala Tyr 
15 20 25 

atg tcc aag gct cat ggg atc gat cct aac atc agg acc ggg gtg aga 12807 
Met Ser Lys Ala His Gly Ile Asp Pro Asn Ile Arg Thr Gly Val Arg 
30 35 40 

aca att acc act ggc agc ccc atc acg tac tcc acc tac ggc aag ttc 12855 
Thr Ile Thr Thr Gly Ser Pro Ile Thr Tyr Ser Thr Tyr Gly Lys Phe 
45 50 55 

ctt gcc gac ggc ggg tgc tcg ggg ggc gct tat gac ata ata att tgt 12903 
Leu Ala Asp Gly Gly Cys Ser Gly Gly Ala Tyr Asp Ile Ile Ile Cys 
60 65 70 75 

gac gag tgc cac tcc acg gat gcc aca tcc atc ttg ggc att ggc act 12951 
Asp Glu Cys His Ser Thr Asp Ala Thr Ser Ile Leu Gly Ile Gly Thr 
80 85 90 

gtc ctt gac caa gca gag act gcg ggg gcg aga ctg gtt gtg ctc gcc 12999 
Val Leu Asp Gln Ala Glu Thr Ala Gly Ala Arg Leu Val Val Leu Ala 
95 100 105 

acc gcc acc cct ccg ggc tcc gtc act gtg ccc cat ccc aac atc gag 13047 
Thr Ala Thr Pro Pro Gly Ser Val Thr Val Pro His Pro Asn Ile Glu 
110 115 120 

gag gtt gct ctg tcc acc acc gga gag atc cct ttt tac ggc aag gct 13095 
Glu Val Ala Leu Ser Thr Thr Gly Glu Ile Pro Phe Tyr Gly Lys Ala 
125 130 135 

atc ccc ctc gaa gta atc aag ggg ggg aga cat ctc atc ttc tgt cat 13143 
Ile Pro Leu Glu Val Ile Lys Gly Gly Arg His Leu Ile Phe Cys His 
140 145 150 155 

tca aag aag aag tgc gac gaa ctc gcc gca aag ctg gtc gca ttg ggc 13191 
Ser Lys Lys Lys Cys Asp Glu Leu Ala Ala Lys Leu Val Ala Leu Gly 
160 165 170 

atc aat gcc gtg gcc tac tac cgc ggt ctt gac gtg tcc gtc atc ccg 13239 
Ile Asn Ala Val Ala Tyr Tyr Arg Gly Leu Asp Val Ser Val Ile Pro 
175 180 185 

acc agc ggc gat gtt gtc gtc gtg gca acc gat gcc ctc atg acc ggc 13287 
Thr Ser Gly Asp Val Val Val Val Ala Thr Asp Ala Leu Met Thr Gly 
190 195 200 

tat acc ggc gac ttc gac tcg gtg ata gac tgc aat acg tgt gtc acc 13335 
Tyr Thr Gly Asp Phe Asp Ser Val Ile Asp Cys Asn Thr Cys Val Thr 
205 210 215 

cag aca gtc gat ttc agc ctt gac cct acc ttc acc att gag aca atc 13383 
Gln Thr Val Asp Phe Ser Leu Asp Pro Thr Phe Thr Ile Glu Thr Ile 
220 225 230 235 

acg ctc ccc caa gat gct gtc tcc cgc act caa cgt cgg ggc agg act 13431 
Thr Leu Pro Gln Asp Ala Val Ser Arg Thr Gln Arg Arg Gly Arg Thr 
240 245 250 

ggc agg ggg aag cca ggc atc tac aga ttt gtg gca ccg ggg gag cgc 13479 
Gly Arg Gly Lys Pro Gly Ile Tyr Arg Phe Val Ala Pro Gly Glu Arg 
255 260 265 

ccc tcc ggc atg ttc gac tcg tcc gtc ctc tgt gag tgc tat gac gca 13527 
Pro Ser Gly Met Phe Asp Ser Ser Val Leu Cys Glu Cys Tyr Asp Ala 
270 275 280 

ggc tgt gct tgg tat gag ctc acg ccc gcc gag act aca gtt agg cta 13575 
Gly Cys Ala Trp Tyr Glu Leu Thr Pro Ala Glu Thr Thr Val Arg Leu 
285 290 295 

cga gcg tac atg aac acc ccg ggg ctt ccc gtg tgc cag gac cat ctt 13623 
Arg Ala Tyr Met Asn Thr Pro Gly Leu Pro Val Cys Gln Asp His Leu 
300 305 310 315 

gaa ttt tgg gag ggc gtc ttt aca ggc ctc act cat ata gat gcc cac 13671 
Glu Phe Trp Glu Gly Val Phe Thr Gly Leu Thr His Ile Asp Ala His 
320 325 330 

ttt cta tcc cag aca aag cag agt ggg gag aac ctt cct tac ctg gta 13719 
Phe Leu Ser Gln Thr Lys Gln Ser Gly Glu Asn Leu Pro Tyr Leu Val 
335 340 345 

gcg tac caa gcc acc gtg tgc gct agg gct caa gcc cct ccc cca tcg 13767 
Ala Tyr Gln Ala Thr Val Cys Ala Arg Ala Gln Ala Pro Pro Pro Ser 
350 355 360 

tgg gac cag atg tgg aag tgt ttg att cgc ctc aag ccc acc ctc cat 13815 
Trp Asp Gln Met Trp Lys Cys Leu Ile Arg Leu Lys Pro Thr Leu His 
365 370 375 

ggg cca aca ccc ctg cta tac aga ctg ggc gct gtt cag aat gaa atc 13863 
Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Ala Val Gln Asn Glu Ile 
380 385 390 395 

acc ctg acg cac cca gtc acc aaa tac atc atg aca tgc atg tcg gcc 13911 
Thr Leu Thr His Pro Val Thr Lys Tyr Ile Met Thr Cys Met Ser Ala 
400 405 410 

gac ctg gag gtc gtc acg agc acc tgg gtg ctc gtt ggc ggc gtc ctg 13959 
Asp Leu Glu Val Val Thr Ser Thr Trp Val Leu Val Gly Gly Val Leu 
415 420 425 

gct gct ttg gcc gcg tat tgc ctg tca aca ggc tgc gtg gtc ata gtg 14007 
Ala Ala Leu Ala Ala Tyr Cys Leu Ser Thr Gly Cys Val Val Ile Val 
430 435 440 

ggc agg gtc gtc ttg tcc ggg aag ccg gca atc ata cct gac agg gaa 14055 
Gly Arg Val Val Leu Ser Gly Lys Pro Ala Ile Ile Pro Asp Arg Glu 
445 450 455 

gtc ctc tac cga gag ttc gat gag atg gaa gag tgc tct cag cac tta 14103 
Val Leu Tyr Arg Glu Phe Asp Glu Met Glu Glu Cys Ser Gln His Leu 
460 465 470 475 

ccg tac atc gag caa ggg atg atg ctc gcc gag cag ttc aag cag aag 14151 
Pro Tyr Ile Glu Gln Gly Met Met Leu Ala Glu Gln Phe Lys Gln Lys 
480 485 490 

gcc ctc ggc ctc ctg cag acc gcg tcc cgt cag gca gag gtt atc gcc 14199 
Ala Leu Gly Leu Leu Gln Thr Ala Ser Arg Gln Ala Glu Val Ile Ala 
495 500 505 

cct gct gtc cag acc aac tgg caa aaa ctc gag acc ttc tgg gcg aag 14247 
Pro Ala Val Gln Thr Asn Trp Gln Lys Leu Glu Thr Phe Trp Ala Lys 
510 515 520 

cat atg tgg aac ttc atc agt ggg ata caa tac ttg gcg ggc ttg tca 14295 
His Met Trp Asn Phe Ile Ser Gly Ile Gln Tyr Leu Ala Gly Leu Ser 
525 530 535 

acg ctg cct ggt aac ccc gcc att gct tca ttg atg gct ttt aca gct 14343 
Thr Leu Pro Gly Asn Pro Ala Ile Ala Ser Leu Met Ala Phe Thr Ala 
540 545 550 555 

gct gtc acc agc cca cta acc act agc caa acc ctc ctc ttc aac ata 14391 
Ala Val Thr Ser Pro Leu Thr Thr Ser Gln Thr Leu Leu Phe Asn Ile 
560 565 570 

ttg ggg ggg tgg gtg gct gcc cag ctc gcc gcc ccc ggt gcc gct act 14439 
Leu Gly Gly Trp Val Ala Ala Gln Leu Ala Ala Pro Gly Ala Ala Thr 
575 580 585 

gcc ttt gtg ggc gct ggc tta gct ggc gcc gcc atc ggc agt gtt gga 14487 
Ala Phe Val Gly Ala Gly Leu Ala Gly Ala Ala Ile Gly Ser Val Gly 
590 595 600 

ctg ggg aag gtc ctc ata gac atc ctt gca ggg tat ggc gcg ggc gtg 14535 
Leu Gly Lys Val Leu Ile Asp Ile Leu Ala Gly Tyr Gly Ala Gly Val 
605 610 615 

gcg gga gct ctt gtg gca ttc aag atc atg agc ggt gag gtc ccc tcc 14583 
Ala Gly Ala Leu Val Ala Phe Lys Ile Met Ser Gly Glu Val Pro Ser 
620 625 630 635 

acg gag gac ctg gtc aat cta ctg ccc gcc atc ctc tcg ccc gga gcc 14631 
Thr Glu Asp Leu Val Asn Leu Leu Pro Ala Ile Leu Ser Pro Gly Ala 
640 645 650 

ctc gta gtc ggc gtg gtc tgt gca gca ata ctg cgc cgg cac gtt ggc 14679 
Leu Val Val Gly Val Val Cys Ala Ala Ile Leu Arg Arg His Val Gly 
655 660 665 

ccg ggc gag ggg gca gtg cag tgg atg aac cgg ctg ata gcc ttc gcc 14727 
Pro Gly Glu Gly Ala Val Gln Trp Met Asn Arg Leu Ile Ala Phe Ala 
670 675 680 

tcc cgg ggg aac cat gtt tcc ccc acg cac tac gtg ccg gag agc gat 14775 
Ser Arg Gly Asn His Val Ser Pro Thr His Tyr Val Pro Glu Ser Asp 
685 690 695 

gca gct gcc cgc gtc act gcc ata ctc agc agc ctc act gta acc cag 14823 
Ala Ala Ala Arg Val Thr Ala Ile Leu Ser Ser Leu Thr Val Thr Gln 
700 705 710 715 

ctc ctg agg cga ctg cac cag tgg ata agc tcg gag tgt acc act cca 14871 
Leu Leu Arg Arg Leu His Gln Trp Ile Ser Ser Glu Cys Thr Thr Pro 
720 725 730 

tgc tcc ggt tcc tgg cta agg gac atc tgg gac tgg ata tgc gag gtg 14919 
Cys Ser Gly Ser Trp Leu Arg Asp Ile Trp Asp Trp Ile Cys Glu Val 
735 740 745 

ttg agc gac ttt aag acc tgg cta aaa gct aag ctc atg cca cag ctg 14967 
Leu Ser Asp Phe Lys Thr Trp Leu Lys Ala Lys Leu Met Pro Gln Leu 
750 755 760 

cct ggg atc ccc ttt gtg tcc tgc cag cgc ggg tat aag ggg gtc tgg 15015 
Pro Gly Ile Pro Phe Val Ser Cys Gln Arg Gly Tyr Lys Gly Val Trp 
765 770 775 

cga ggg gac ggc atc atg cac act cgc tgc cac tgt gga gct gag atc 15063 
Arg Gly Asp Gly Ile Met His Thr Arg Cys His Cys Gly Ala Glu Ile 
780 785 790 795 

act gga cat gtc aaa aac ggg acg atg agg atc gtc ggt cct agg acc 15111 
Thr Gly His Val Lys Asn Gly Thr Met Arg Ile Val Gly Pro Arg Thr 
800 805 810 

tgc agg aac atg tgg agt ggg acc ttc ccc att aat gcc tac acc acg 15159 
Cys Arg Asn Met Trp Ser Gly Thr Phe Pro Ile Asn Ala Tyr Thr Thr 
815 820 825 

ggc ccc tgt acc ccc ctt cct gcg ccg aac tac acg ttc gcg cta tgg 15207 
Gly Pro Cys Thr Pro Leu Pro Ala Pro Asn Tyr Thr Phe Ala Leu Trp 
830 835 840 

agg gtg tct gca gag gaa tac gtg gag ata agg cag gtg ggg gac ttc 15255 
Arg Val Ser Ala Glu Glu Tyr Val Glu Ile Arg Gln Val Gly Asp Phe 
845 850 855 

cac tac gtg acg ggt atg act act gac aat ctt aaa tgc ccg tgc cag 15303 
His Tyr Val Thr Gly Met Thr Thr Asp Asn Leu Lys Cys Pro Cys Gln 
860 865 870 875 

gtc cca tcg ccc gaa ttt ttc aca gaa ttg gac ggg gtg cgc cta cat 15351 
Val Pro Ser Pro Glu Phe Phe Thr Glu Leu Asp Gly Val Arg Leu His 
880 885 890 

agg ttt gcg ccc ccc tgc aag ccc ttg ctg cgg gag gag gta tca ttc 15399 
Arg Phe Ala Pro Pro Cys Lys Pro Leu Leu Arg Glu Glu Val Ser Phe 
895 900 905 

aga gta gga ctc cac gaa tac ccg gta ggg tcg caa tta cct tgc gag 15447 
Arg Val Gly Leu His Glu Tyr Pro Val Gly Ser Gln Leu Pro Cys Glu 
910 915 920 

ccc gaa ccg gac gtg gcc gtg ttg acg tcc atg ctc act gat ccc tcc 15495 
Pro Glu Pro Asp Val Ala Val Leu Thr Ser Met Leu Thr Asp Pro Ser 
925 930 935 

cat ata aca gca gag gcg gcc ggg cga agg ttg gcg agg gga tca ccc 15543 
His Ile Thr Ala Glu Ala Ala Gly Arg Arg Leu Ala Arg Gly Ser Pro 
940 945 950 955 

ccc tct gtg gcc agc tcc tcg gct agc cag cta tcc gct cca tct ctc 15591 
Pro Ser Val Ala Ser Ser Ser Ala Ser Gln Leu Ser Ala Pro Ser Leu 
960 965 970 

aag gca act tgc acc gct aac cat gac tcc cct gat gct gag ctc ata 15639 
Lys Ala Thr Cys Thr Ala Asn His Asp Ser Pro Asp Ala Glu Leu Ile 
975 980 985 

gag gcc aac ctc cta tgg agg cag gag atg ggc ggc aac atc acc agg 15687 
Glu Ala Asn Leu Leu Trp Arg Gln Glu Met Gly Gly Asn Ile Thr Arg 
990 995 1000 

gtt gag tca gaa aac aaa gtg gtg att ctg gac tcc ttc gat ccg ctt 15735 
Val Glu Ser Glu Asn Lys Val Val Ile Leu Asp Ser Phe Asp Pro Leu 
1005 1010 1015 

gtg gcg gag gag gac gag cgg gag atc tcc gta ccc gca gaa atc ctg 15783 
Val Ala Glu Glu Asp Glu Arg Glu Ile Ser Val Pro Ala Glu Ile Leu 
1020 1025 1030 1035 

cgg aag tct cgg aga ttc gcc cag gcc ctg ccc gtt tgg gcg cgg ccg 15831 
Arg Lys Ser Arg Arg Phe Ala Gln Ala Leu Pro Val Trp Ala Arg Pro 
1040 1045 1050 

gac tat aac ccc ccg cta gtg gag acg tgg aaa aag ccc gac tac gaa 15879 
Asp Tyr Asn Pro Pro Leu Val Glu Thr Trp Lys Lys Pro Asp Tyr Glu 
1055 1060 1065 

cca cct gtg gtc cat ggc tgc ccg ctt cca cct cca aag tcc cct cct 15927 
Pro Pro Val Val His Gly Cys Pro Leu Pro Pro Pro Lys Ser Pro Pro 
1070 1075 1080 

gtg cct ccg cct cgg aag aag cgg acg gtg gtc ctc act gaa tca acc 15975 
Val Pro Pro Pro Arg Lys Lys Arg Thr Val Val Leu Thr Glu Ser Thr 
1085 1090 1095 

cta tct act gcc ttg gcc gag ctc gcc acc aga agc ttt ggc agc tcc 16023 
Leu Ser Thr Ala Leu Ala Glu Leu Ala Thr Arg Ser Phe Gly Ser Ser 
1100 1105 1110 1115 

tca act tcc ggc att acg ggc gac aat acg aca aca tcc tct gag ccc 16071 
Ser Thr Ser Gly Ile Thr Gly Asp Asn Thr Thr Thr Ser Ser Glu Pro 
1120 1125 1130 

gcc cct tct ggc tgc ccc ccc gac tcc gac gct gag tcc tat tcc tcc 16119 
Ala Pro Ser Gly Cys Pro Pro Asp Ser Asp Ala Glu Ser Tyr Ser Ser 
1135 1140 1145 

atg ccc ccc ctg gag ggg gag cct ggg gat ccg gat ctt agc gac ggg 16167 
Met Pro Pro Leu Glu Gly Glu Pro Gly Asp Pro Asp Leu Ser Asp Gly 
1150 1155 1160 

tca tgg tca acg gtc agt agt gag gcc aac gcg gag gat gtc gtg tgc 16215 
Ser Trp Ser Thr Val Ser Ser Glu Ala Asn Ala Glu Asp Val Val Cys 
1165 1170 1175 

tgc tca atg tct tac tct tgg aca ggc gca ctc gtc acc ccg tgc gcc 16263 
Cys Ser Met Ser Tyr Ser Trp Thr Gly Ala Leu Val Thr Pro Cys Ala 
1180 1185 1190 1195 

gcg gaa gaa cag aaa ctg ccc atc aat gca cta agc aac tcg ttg cta 16311 
Ala Glu Glu Gln Lys Leu Pro Ile Asn Ala Leu Ser Asn Ser Leu Leu 
1200 1205 1210 

cgt cac cac aat ttg gtg tat tcc acc acc tca cgc agt gct tgc caa 16359 
Arg His His Asn Leu Val Tyr Ser Thr Thr Ser Arg Ser Ala Cys Gln 
1215 1220 1225 

agg cag aag aaa gtc aca ttt gac aga ctg caa gtt ctg gac agc cat 16407 
Arg Gln Lys Lys Val Thr Phe Asp Arg Leu Gln Val Leu Asp Ser His 
1230 1235 1240 

tac cag gac gta ctc aag gag gtt aaa gca gcg gcg tca aaa gtg aag 16455 
Tyr Gln Asp Val Leu Lys Glu Val Lys Ala Ala Ala Ser Lys Val Lys 
1245 1250 1255 

gct aac ttg cta tcc gta gag gaa gct tgc agc ctg acg ccc cca cac 16503 
Ala Asn Leu Leu Ser Val Glu Glu Ala Cys Ser Leu Thr Pro Pro His 
1260 1265 1270 1275 

tca gcc aaa tcc aag ttt ggt tat ggg gca aaa gac gtc cgt tgc cat 16551 
Ser Ala Lys Ser Lys Phe Gly Tyr Gly Ala Lys Asp Val Arg Cys His 
1280 1285 1290 

gcc aga aag gcc gta acc cac atc aac tcc gtg tgg aaa gac ctt ctg 16599 
Ala Arg Lys Ala Val Thr His Ile Asn Ser Val Trp Lys Asp Leu Leu 
1295 1300 1305 

gaa gac aat gta aca cca ata gac act acc atc atg gct aag aac gag 16647 
Glu Asp Asn Val Thr Pro Ile Asp Thr Thr Ile Met Ala Lys Asn Glu 
1310 1315 1320 

gtt ttc tgc gtt cag cct gag aag ggg ggt cgt aag cca gct cgt ctc 16695 
Val Phe Cys Val Gln Pro Glu Lys Gly Gly Arg Lys Pro Ala Arg Leu 
1325 1330 1335 

atc gtg ttc ccc gat ctg ggc gtg cgc gtg tgc gaa aag atg gct ttg 16743 
Ile Val Phe Pro Asp Leu Gly Val Arg Val Cys Glu Lys Met Ala Leu 
1340 1345 1350 1355 

tac gac gtg gtt aca aag ctc ccc ttg gcc gtg atg gga agc tcc tac 16791 
Tyr Asp Val Val Thr Lys Leu Pro Leu Ala Val Met Gly Ser Ser Tyr 
1360 1365 1370 

gga ttc caa tac tca cca gga cag cgg gtt gaa ttc ctc gtg caa gcg 16839 
Gly Phe Gln Tyr Ser Pro Gly Gln Arg Val Glu Phe Leu Val Gln Ala 
1375 1380 1385 

tgg aag tcc aag aaa acc cca atg ggg ttc tcg tat gat acc cgc tgc 16887 
Trp Lys Ser Lys Lys Thr Pro Met Gly Phe Ser Tyr Asp Thr Arg Cys 
1390 1395 1400 

ttt gac tcc aca gtc act gag agc gac atc cgt acg gag gag gca atc 16935 
Phe Asp Ser Thr Val Thr Glu Ser Asp Ile Arg Thr Glu Glu Ala Ile 
1405 1410 1415 

tac caa tgt tgt gac ctc gac ccc caa gcc cgc gtg gcc atc aag tcc 16983 
Tyr Gln Cys Cys Asp Leu Asp Pro Gln Ala Arg Val Ala Ile Lys Ser 
1420 1425 1430 1435 

ctc acc gag agg ctt tat gtt ggg ggc cct ctt acc aat tca agg ggg 17031 
Leu Thr Glu Arg Leu Tyr Val Gly Gly Pro Leu Thr Asn Ser Arg Gly 
1440 1445 1450 

gag aac tgc ggc tat cgc agg tgc cgc gcg agc ggc gta ctg aca act 17079 
Glu Asn Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Leu Thr Thr 
1455 1460 1465 

agc tgt ggt aac acc ctc act tgc tac atc aag gcc cgg gca gcc tgt 17127 
Ser Cys Gly Asn Thr Leu Thr Cys Tyr Ile Lys Ala Arg Ala Ala Cys 
1470 1475 1480 

cga gcc gca ggg ctc cag gac tgc acc atg ctc gtg tgt ggc gac gac 17175 
Arg Ala Ala Gly Leu Gln Asp Cys Thr Met Leu Val Cys Gly Asp Asp 
1485 1490 1495 

tta gtc gtt atc tgt gaa agc gcg ggg gtc cag gag gac gcg gcg agc 17223 
Leu Val Val Ile Cys Glu Ser Ala Gly Val Gln Glu Asp Ala Ala Ser 
1500 1505 1510 1515 

ctg aga gcc ttc acg gag gct atg acc agg tac tcc gcc ccc cct ggg 17271 
Leu Arg Ala Phe Thr Glu Ala Met Thr Arg Tyr Ser Ala Pro Pro Gly 
1520 1525 1530 

gac ccc cca caa cca gaa tac gac ttg gag ctc ata aca tca tgc tcc 17319 
Asp Pro Pro Gln Pro Glu Tyr Asp Leu Glu Leu Ile Thr Ser Cys Ser 
1535 1540 1545 

tcc aac gtg tca gtc gcc cac gac ggc gct gga aag agg gtc tac tac 17367 
Ser Asn Val Ser Val Ala His Asp Gly Ala Gly Lys Arg Val Tyr Tyr 
1550 1555 1560 

ctc acc cgt gac cct aca acc ccc ctc gcg aga gct gcg tgg gag aca 17415 
Leu Thr Arg Asp Pro Thr Thr Pro Leu Ala Arg Ala Ala Trp Glu Thr 
1565 1570 1575 

gca aga cac act cca gtc aat tcc tgg cta ggc aac ata atc atg ttt 17463 
Ala Arg His Thr Pro Val Asn Ser Trp Leu Gly Asn Ile Ile Met Phe 
1580 1585 1590 1595 

gcc ccc aca ctg tgg gcg agg atg ata ctg atg acc cat ttc ttt agc 17511 
Ala Pro Thr Leu Trp Ala Arg Met Ile Leu Met Thr His Phe Phe Ser 
1600 1605 1610 

gtc ctt ata gcc agg gac cag ctt gaa cag gcc ctc gat tgc gag atc 17559 
Val Leu Ile Ala Arg Asp Gln Leu Glu Gln Ala Leu Asp Cys Glu Ile 
1615 1620 1625 

tac ggg gcc tgc tac tcc ata gaa cca ctg gat cta cct cca atc att 17607 
Tyr Gly Ala Cys Tyr Ser Ile Glu Pro Leu Asp Leu Pro Pro Ile Ile 
1630 1635 1640 

caa aga ctc cat ggc ctc agc gca ttt tca ctc cac agt tac tct cca 17655 
Gln Arg Leu His Gly Leu Ser Ala Phe Ser Leu His Ser Tyr Ser Pro 
1645 1650 1655 

ggt gaa atc aat agg gtg gcc gca tgc ctc aga aaa ctt ggg gta ccg 17703 
Gly Glu Ile Asn Arg Val Ala Ala Cys Leu Arg Lys Leu Gly Val Pro 
1660 1665 1670 1675 

ccc ttg cga gct tgg aga cac cgg gcc cgg agc gtc cgc gct agg ctt 17751 
Pro Leu Arg Ala Trp Arg His Arg Ala Arg Ser Val Arg Ala Arg Leu 
1680 1685 1690 

ctg gcc aga gga ggc agg gct gcc ata tgt ggc aag tac ctc ttc aac 17799 
Leu Ala Arg Gly Gly Arg Ala Ala Ile Cys Gly Lys Tyr Leu Phe Asn 
1695 1700 1705 

tgg gca gta aga aca aag ctc aaa ctc act cca ata gcg gcc gct ggc 17847 
Trp Ala Val Arg Thr Lys Leu Lys Leu Thr Pro Ile Ala Ala Ala Gly 
1710 1715 1720 

cag ctg gac ttg tcc ggc tgg ttc acg gct ggc tac agc ggg gga gac 17895 
Gln Leu Asp Leu Ser Gly Trp Phe Thr Ala Gly Tyr Ser Gly Gly Asp 
1725 1730 1735 

att tat cac agc gtg tct cat gcc cgg ccc cgc tgg atc tgg ttt tgc 17943 
Ile Tyr His Ser Val Ser His Ala Arg Pro Arg Trp Ile Trp Phe Cys 
1740 1745 1750 1755 

cta ctc ctg ctt gct gca ggg gta ggc atc tac ctc ctc ccc aac cga 17991 
Leu Leu Leu Leu Ala Ala Gly Val Gly Ile Tyr Leu Leu Pro Asn Arg 
1760 1765 1770 

atg agc acg aat cct aaa cct caa aga aag acc aaa cgt aac acc aac 18039 
Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr Asn 
1775 1780 1785 

cgg cgg ccg cag gac gtc aag ttc ccg ggt ggc ggt cag atc gtt ggt 18087 
Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly Gln Ile Val Gly 
1790 1795 1800 

gga gtt tac ttg ttg ccg cgc agg ggc cct aga ttg ggt gtg cgc gcg 18135 
Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Leu Gly Val Arg Ala 
1805 1810 1815 

acg aga aag act tcc gag cgg tcg caa cct cga ggt aga cgt cag cct 18183 
Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly Arg Arg Gln Pro 
1820 1825 1830 1835 

atc ccc aag gct cgt cgg ccc gag ggc agg acc tgg gct cag ccc ggg 18231 
Ile Pro Lys Ala Arg Arg Pro Glu Gly Arg Thr Trp Ala Gln Pro Gly 
1840 1845 1850 

tac cct tgg ccc ctc tat ggc aat gag ggc tgc ggg tgg gcg gga tgg 18279 
Tyr Pro Trp Pro Leu Tyr Gly Asn Glu Gly Cys Gly Trp Ala Gly Trp 
1855 1860 1865 

ctc ctg tct ccc cgt ggc tct cgg cct agc tgg ggc ccc aca gac ccc 18327 
Leu Leu Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Thr Asp Pro 
1870 1875 1880 

cgg cgt agg tcg cgc aat ttg ggt aag gtc atc gat acc ctt acg tgc 18375 
Arg Arg Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu Thr Cys 
1885 1890 1895 

ggc ttc gcc gac ctc atg ggg tac ata ccg ctc gtc ggc gcc cct ctt 18423 
Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Ala Pro Leu 
1900 1905 1910 1915 

gga ggc gct gcc agg gcc taatagtcga ctttgttccc actgtacttt 18471 
Gly Gly Ala Ala Arg Ala 
1920 



tagctcgtac 


aaaatacaat 


atacttttca 


tttctccgta 


aacaacatgt 


tttcccatgt 


18531 


aatatccttt 


tctatttttc 


gttccgttac 


caactttaca 


catactttat 


atagctattc 


18591 


acttctatac 


actaaaaaac 


taagacaatt 


ttaattttgc 


tgcctgccat 


atttcaattt 


18651 


gttataaatt 


cctataattt 


atcctattag 


tagctaaaaa 


aagatgaatg 


tgaatcgaat 


18711 


cctaagagaa 


ttggatctga 


tccacaggac 


gggtgtggtc 


gccatgatcg 


cgtagtcgat 


18771 


agtggctcca 


agtagcgaag 


cgagcaggac 


tgggcggcgg 


ccaaagcggt 


cggacagtgc 


18831 


tccgagaacg 


ggtgcgcata 


gaaattgcat 


caacgcatat 


agcgctagca 


gcacgccata 


18891 


gtgactggcg 


atgctgtcgg 


aatggacgat 


atcccgcaag 


aggcccggca 


gtaccggcat 


18951 


aaccaagcct 


atgcctacag 


catccagggt 


gacggtgccg 


aggatgacga 


tgagcgcatt 


19011 


gttagatttc 


atacacggtg 


cctgactgcg 


ttagcaattt 


aactgtgata 


aactaccgca 


19071 


ttaaagcttt 


ttctttccaa 


tttttttttt 


ttcgtcatta 


taaaaatcat 


tacgaccgag 


19131 


attcccgggt 


aataactgat 


ataattaaat 


tgaagctcta 


atttgtgagt 


ttagtataca 


19191 


tgcatttact 


tataatacag 


ttttttagtt 


ttgctggccg 


catcttctca 


aatatgcttc 


19251 


ccagcctgct 


tttctgtaac 


gttcaccctc 


taccttagca 


tcccttccct 


ttgcaaatag 


19311 


tcctcttcca 


acaataataa 


tgtcagatcc 


tgtagagacc 


acatcatcca 


cggttctata 


19371 


ctgttgaccc 


aatgcgtctc 


ccttgtcatc 


taaacccaca 


ccgggtgtca 


taatcaacca 


19431 


atcgtaacct 


tcatctcttc 


cacccatgtc 


tctttgagca 


ataaagccga 


taacaaaatc 


19491 


tttgtcgctc 


ttcgcaatgt 


caacagtacc 


cttagtatat 


tctccagtag 


atagggagcc 


19551 


cttgcatgac 


aattctgcta 


acatcaaaag 


gcctctaggt 


tcctttgtta 


cttcttctgc 


19611 


cgcctgcttc 


aaaccgctaa 


caatacctgg 


gcccaccaca 


ccgtgtgcat 


tcgtaatgtc 


19671 
tgcccattct 


gctattctgt 


atacacccgc 


agagtactgc 


aatttgactg tattaccaat 


19731 


gtcagcaaat 


tttctgtctt 


cgaagagtaa 


aaaattgtac 


ttggcggata 


atgcctttag 


19791 


cggcttaact 


gtgccctcca 


tggaaaaatc 


agtcaagata 


tccacatgtg tttttagtaa 


19851 


acaaattttg 


ggacctaatg 


cttcaactaa 


ctccagtaat 


tccttggtgg 


tacgaacatc 


19911 


caatgaagca 


cacaagtttg 


tttgcttttc 


gtgcatgata 


ttaaatagct 


tggcagcaac 


19971 


aggactagga 


tgagtagcag 


cacgttcctt 


atatgtagct 


ttcgacatga 


tttatcttcg 


20031 


tttcctgcag 


gtttttgttc 


tgtgcagttg 


ggttaagaat 


actgggcaat 


ttcatgtttc 


20091 


ttcaacacta 


catatgcgta 


tatataccaa 


tctaagtctg 


tgctccttcc 


ttcgttcttc 


20151 


cttctgttcg 


gagattaccg 


aatcaaaaaa 


atttcaagga 


aaccgaaatc 


aaaaaaaaga 


20211 


ataaaaaaaa 


aatgatgaat 


tgaaaagctt 


atcgat 






20247 



<210> 19 
<211> 1921 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 
pd.delta.NS3NS5.pj.core150 

<400> 19 

Met Ala Ala Tyr Ala Ala Gln Gly Tyr Lys Val Leu Val Leu Asn Pro 
1 5 10 15 

Ser Val Ala Ala Thr Leu Gly Phe Gly Ala Tyr Met Ser Lys Ala His 
20 25 30 

Gly Ile Asp Pro Asn Ile Arg Thr Gly Val Arg Thr Ile Thr Thr Gly 
35 40 45 

Ser Pro Ile Thr Tyr Ser Thr Tyr Gly Lys Phe Leu Ala Asp Gly Gly 
50 55 60 

Cys Ser Gly Gly Ala Tyr Asp Ile Ile Ile Cys Asp Glu Cys His Ser 
65 70 75 80 

Thr Asp Ala Thr Ser Ile Leu Gly Ile Gly Thr Val Leu Asp Gln Ala 
85 90 95 

Glu Thr Ala Gly Ala Arg Leu Val Val Leu Ala Thr Ala Thr Pro Pro 
100 105 110 

Gly Ser Val Thr Val Pro His Pro Asn Ile Glu Glu Val Ala Leu Ser 
115 120 125 

Thr Thr Gly Glu Ile Pro Phe Tyr Gly Lys Ala Ile Pro Leu Glu Val 
130 135 140 

Ile Lys Gly Gly Arg His Leu Ile Phe Cys His Ser Lys Lys Lys Cys 
145 150 155 160 

Asp Glu Leu Ala Ala Lys Leu Val Ala Leu Gly Ile Asn Ala Val Ala 
165 170 175 

Tyr Tyr Arg Gly Leu Asp Val Ser Val Ile Pro Thr Ser Gly Asp Val 
180 185 190 

Val Val Val Ala Thr Asp Ala Leu Met Thr Gly Tyr Thr Gly Asp Phe 
195 200 205 

Asp Ser Val Ile Asp Cys Asn Thr Cys Val Thr Gln Thr Val Asp Phe 
210 215 220 

Ser Leu Asp Pro Thr Phe Thr Ile Glu Thr Ile Thr Leu Pro Gln Asp 
225 230 235 240 

Ala Val Ser Arg Thr Gln Arg Arg Gly Arg Thr Gly Arg Gly Lys Pro 
245 250 255 

Gly Ile Tyr Arg Phe Val Ala Pro Gly Glu Arg Pro Ser Gly Met Phe 
260 265 270 

Asp Ser Ser Val Leu Cys Glu Cys Tyr Asp Ala Gly Cys Ala Trp Tyr 
275 280 285 

Glu Leu Thr Pro Ala Glu Thr Thr Val Arg Leu Arg Ala Tyr Met Asn 
290 295 300 

Thr Pro Gly Leu Pro Val Cys Gln Asp His Leu Glu Phe Trp Glu Gly 
305 310 315 320 

Val Phe Thr Gly Leu Thr His Ile Asp Ala His Phe Leu Ser Gln Thr 
325 330 335 

Lys Gln Ser Gly Glu Asn Leu Pro Tyr Leu Val Ala Tyr Gln Ala Thr 
340 345 350 

Val Cys Ala Arg Ala Gln Ala Pro Pro Pro Ser Trp Asp Gln Met Trp 
355 360 365 

Lys Cys Leu Ile Arg Leu Lys Pro Thr Leu His Gly Pro Thr Pro Leu 
370 375 380 

Leu Tyr Arg Leu Gly Ala Val Gln Asn Glu Ile Thr Leu Thr His Pro 
385 390 395 400 

Val Thr Lys Tyr Ile Met Thr Cys Met Ser Ala Asp Leu Glu Val Val 
405 410 415 

Thr Ser Thr Trp Val Leu Val Gly Gly Val Leu Ala Ala Leu Ala Ala 
420 425 430 

Tyr Cys Leu Ser Thr Gly Cys Val Val Ile Val Gly Arg Val Val Leu 
435 440 445 

Ser Gly Lys Pro Ala Ile Ile Pro Asp Arg Glu Val Leu Tyr Arg Glu 
450 455 460 

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Phe Asp Glu Met Glu Glu Cys Ser Gin His Leu Pro Tyr He Glu Gin 
465 470 475 480 

Gly Met Met Leu Ala Glu Gln Phe Lys Gln Lys Ala Leu Gly Leu Leu
485 490 495

Gln Thr Ala Ser Arg Gln Ala Glu Val Ile Ala Pro Ala Val Gln Thr
500 505 510

Asn Trp Gln Lys Leu Glu Thr Phe Trp Ala Lys His Met Trp Asn Phe
515 520 525

Ile Ser Gly Ile Gln Tyr Leu Ala Gly Leu Ser Thr Leu Pro Gly Asn
530 535 540

Pro Ala Ile Ala Ser Leu Met Ala Phe Thr Ala Ala Val Thr Ser Pro
545 550 555 560

Leu Thr Thr Ser Gln Thr Leu Leu Phe Asn Ile Leu Gly Gly Trp Val
565 570 575

Ala Ala Gln Leu Ala Ala Pro Gly Ala Ala Thr Ala Phe Val Gly Ala
580 585 590

Gly Leu Ala Gly Ala Ala Ile Gly Ser Val Gly Leu Gly Lys Val Leu
595 600 605

Ile Asp Ile Leu Ala Gly Tyr Gly Ala Gly Val Ala Gly Ala Leu Val
610 615 620

Ala Phe Lys Ile Met Ser Gly Glu Val Pro Ser Thr Glu Asp Leu Val
625 630 635 640

Asn Leu Leu Pro Ala Ile Leu Ser Pro Gly Ala Leu Val Val Gly Val
645 650 655

Val Cys Ala Ala Ile Leu Arg Arg His Val Gly Pro Gly Glu Gly Ala
660 665 670

Val Gln Trp Met Asn Arg Leu Ile Ala Phe Ala Ser Arg Gly Asn His
675 680 685

Val Ser Pro Thr His Tyr Val Pro Glu Ser Asp Ala Ala Ala Arg Val
690 695 700

Thr Ala Ile Leu Ser Ser Leu Thr Val Thr Gln Leu Leu Arg Arg Leu
705 710 715 720

His Gln Trp Ile Ser Ser Glu Cys Thr Thr Pro Cys Ser Gly Ser Trp
725 730 735

Leu Arg Asp Ile Trp Asp Trp Ile Cys Glu Val Leu Ser Asp Phe Lys
740 745 750

Thr Trp Leu Lys Ala Lys Leu Met Pro Gln Leu Pro Gly Ile Pro Phe
755 760 765



Val Ser Cys Gln Arg Gly Tyr Lys Gly Val Trp Arg Gly Asp Gly Ile
770 775 780

Met His Thr Arg Cys His Cys Gly Ala Glu Ile Thr Gly His Val Lys
785 790 795 800

Asn Gly Thr Met Arg Ile Val Gly Pro Arg Thr Cys Arg Asn Met Trp
805 810 815

Ser Gly Thr Phe Pro Ile Asn Ala Tyr Thr Thr Gly Pro Cys Thr Pro
820 825 830

Leu Pro Ala Pro Asn Tyr Thr Phe Ala Leu Trp Arg Val Ser Ala Glu
835 840 845

Glu Tyr Val Glu Ile Arg Gln Val Gly Asp Phe His Tyr Val Thr Gly
850 855 860

Met Thr Thr Asp Asn Leu Lys Cys Pro Cys Gln Val Pro Ser Pro Glu
865 870 875 880

Phe Phe Thr Glu Leu Asp Gly Val Arg Leu His Arg Phe Ala Pro Pro
885 890 895

Cys Lys Pro Leu Leu Arg Glu Glu Val Ser Phe Arg Val Gly Leu His
900 905 910

Glu Tyr Pro Val Gly Ser Gln Leu Pro Cys Glu Pro Glu Pro Asp Val
915 920 925

Ala Val Leu Thr Ser Met Leu Thr Asp Pro Ser His Ile Thr Ala Glu
930 935 940

Ala Ala Gly Arg Arg Leu Ala Arg Gly Ser Pro Pro Ser Val Ala Ser
945 950 955 960

Ser Ser Ala Ser Gln Leu Ser Ala Pro Ser Leu Lys Ala Thr Cys Thr
965 970 975

Ala Asn His Asp Ser Pro Asp Ala Glu Leu Ile Glu Ala Asn Leu Leu
980 985 990

Trp Arg Gln Glu Met Gly Gly Asn Ile Thr Arg Val Glu Ser Glu Asn
995 1000 1005

Lys Val Val Ile Leu Asp Ser Phe Asp Pro Leu Val Ala Glu Glu Asp
1010 1015 1020

Glu Arg Glu Ile Ser Val Pro Ala Glu Ile Leu Arg Lys Ser Arg Arg
1025 1030 1035 1040

Phe Ala Gln Ala Leu Pro Val Trp Ala Arg Pro Asp Tyr Asn Pro Pro
1045 1050 1055

Leu Val Glu Thr Trp Lys Lys Pro Asp Tyr Glu Pro Pro Val Val His
1060 1065 1070

Gly Cys Pro Leu Pro Pro Pro Lys Ser Pro Pro Val Pro Pro Pro Arg
1075 1080 1085

Lys Lys Arg Thr Val Val Leu Thr Glu Ser Thr Leu Ser Thr Ala Leu
1090 1095 1100



Ala Glu Leu Ala Thr Arg Ser Phe Gly Ser Ser Ser Thr Ser Gly Ile
1105 1110 1115 1120

Thr Gly Asp Asn Thr Thr Thr Ser Ser Glu Pro Ala Pro Ser Gly Cys
1125 1130 1135

Pro Pro Asp Ser Asp Ala Glu Ser Tyr Ser Ser Met Pro Pro Leu Glu
1140 1145 1150

Gly Glu Pro Gly Asp Pro Asp Leu Ser Asp Gly Ser Trp Ser Thr Val
1155 1160 1165

Ser Ser Glu Ala Asn Ala Glu Asp Val Val Cys Cys Ser Met Ser Tyr
1170 1175 1180

Ser Trp Thr Gly Ala Leu Val Thr Pro Cys Ala Ala Glu Glu Gln Lys
1185 1190 1195 1200

Leu Pro Ile Asn Ala Leu Ser Asn Ser Leu Leu Arg His His Asn Leu
1205 1210 1215

Val Tyr Ser Thr Thr Ser Arg Ser Ala Cys Gln Arg Gln Lys Lys Val
1220 1225 1230

Thr Phe Asp Arg Leu Gln Val Leu Asp Ser His Tyr Gln Asp Val Leu
1235 1240 1245

Lys Glu Val Lys Ala Ala Ala Ser Lys Val Lys Ala Asn Leu Leu Ser
1250 1255 1260

Val Glu Glu Ala Cys Ser Leu Thr Pro Pro His Ser Ala Lys Ser Lys
1265 1270 1275 1280

Phe Gly Tyr Gly Ala Lys Asp Val Arg Cys His Ala Arg Lys Ala Val
1285 1290 1295

Thr His Ile Asn Ser Val Trp Lys Asp Leu Leu Glu Asp Asn Val Thr
1300 1305 1310

Pro Ile Asp Thr Thr Ile Met Ala Lys Asn Glu Val Phe Cys Val Gln
1315 1320 1325

Pro Glu Lys Gly Gly Arg Lys Pro Ala Arg Leu Ile Val Phe Pro Asp
1330 1335 1340

Leu Gly Val Arg Val Cys Glu Lys Met Ala Leu Tyr Asp Val Val Thr
1345 1350 1355 1360

Lys Leu Pro Leu Ala Val Met Gly Ser Ser Tyr Gly Phe Gln Tyr Ser
1365 1370 1375

Pro Gly Gln Arg Val Glu Phe Leu Val Gln Ala Trp Lys Ser Lys Lys
1380 1385 1390

Thr Pro Met Gly Phe Ser Tyr Asp Thr Arg Cys Phe Asp Ser Thr Val
1395 1400 1405

Thr Glu Ser Asp Ile Arg Thr Glu Glu Ala Ile Tyr Gln Cys Cys Asp
1410 1415 1420

Leu Asp Pro Gln Ala Arg Val Ala Ile Lys Ser Leu Thr Glu Arg Leu
1425 1430 1435 1440

Tyr Val Gly Gly Pro Leu Thr Asn Ser Arg Gly Glu Asn Cys Gly Tyr
1445 1450 1455

Arg Arg Cys Arg Ala Ser Gly Val Leu Thr Thr Ser Cys Gly Asn Thr
1460 1465 1470

Leu Thr Cys Tyr Ile Lys Ala Arg Ala Ala Cys Arg Ala Ala Gly Leu
1475 1480 1485

Gln Asp Cys Thr Met Leu Val Cys Gly Asp Asp Leu Val Val Ile Cys
1490 1495 1500

Glu Ser Ala Gly Val Gln Glu Asp Ala Ala Ser Leu Arg Ala Phe Thr
1505 1510 1515 1520

Glu Ala Met Thr Arg Tyr Ser Ala Pro Pro Gly Asp Pro Pro Gln Pro
1525 1530 1535

Glu Tyr Asp Leu Glu Leu Ile Thr Ser Cys Ser Ser Asn Val Ser Val
1540 1545 1550

Ala His Asp Gly Ala Gly Lys Arg Val Tyr Tyr Leu Thr Arg Asp Pro
1555 1560 1565

Thr Thr Pro Leu Ala Arg Ala Ala Trp Glu Thr Ala Arg His Thr Pro
1570 1575 1580

Val Asn Ser Trp Leu Gly Asn Ile Ile Met Phe Ala Pro Thr Leu Trp
1585 1590 1595 1600

Ala Arg Met Ile Leu Met Thr His Phe Phe Ser Val Leu Ile Ala Arg
1605 1610 1615

Asp Gln Leu Glu Gln Ala Leu Asp Cys Glu Ile Tyr Gly Ala Cys Tyr
1620 1625 1630

Ser Ile Glu Pro Leu Asp Leu Pro Pro Ile Ile Gln Arg Leu His Gly
1635 1640 1645

Leu Ser Ala Phe Ser Leu His Ser Tyr Ser Pro Gly Glu Ile Asn Arg
1650 1655 1660

Val Ala Ala Cys Leu Arg Lys Leu Gly Val Pro Pro Leu Arg Ala Trp
1665 1670 1675 1680

Arg His Arg Ala Arg Ser Val Arg Ala Arg Leu Leu Ala Arg Gly Gly
1685 1690 1695

Arg Ala Ala Ile Cys Gly Lys Tyr Leu Phe Asn Trp Ala Val Arg Thr
1700 1705 1710

Lys Leu Lys Leu Thr Pro Ile Ala Ala Ala Gly Gln Leu Asp Leu Ser
1715 1720 1725

Gly Trp Phe Thr Ala Gly Tyr Ser Gly Gly Asp Ile Tyr His Ser Val
1730 1735 1740

Ser His Ala Arg Pro Arg Trp Ile Trp Phe Cys Leu Leu Leu Leu Ala
1745 1750 1755 1760

Ala Gly Val Gly Ile Tyr Leu Leu Pro Asn Arg Met Ser Thr Asn Pro
1765 1770 1775

Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr Asn Arg Arg Pro Gln Asp
1780 1785 1790

Val Lys Phe Pro Gly Gly Gly Gln Ile Val Gly Gly Val Tyr Leu Leu
1795 1800 1805

Pro Arg Arg Gly Pro Arg Leu Gly Val Arg Ala Thr Arg Lys Thr Ser
1810 1815 1820

Glu Arg Ser Gln Pro Arg Gly Arg Arg Gln Pro Ile Pro Lys Ala Arg
1825 1830 1835 1840

Arg Pro Glu Gly Arg Thr Trp Ala Gln Pro Gly Tyr Pro Trp Pro Leu
1845 1850 1855

Tyr Gly Asn Glu Gly Cys Gly Trp Ala Gly Trp Leu Leu Ser Pro Arg
1860 1865 1870

Gly Ser Arg Pro Ser Trp Gly Pro Thr Asp Pro Arg Arg Arg Ser Arg
1875 1880 1885

Asn Leu Gly Lys Val Ile Asp Thr Leu Thr Cys Gly Phe Ala Asp Leu
1890 1895 1900

Met Gly Tyr Ile Pro Leu Val Gly Ala Pro Leu Gly Gly Ala Ala Arg
1905 1910 1915 1920

Ala